Immune Repertoire Sequence Amplification Methods and Applications

ABSTRACT

The present invention relates generally to the field of immune binding proteins and methods for obtaining immune binding proteins from genomic or other sources. The present invention also relates to nucleic acids encoding the immune binding proteins in which the natural multimeric association of chains is maintained in the nucleic acids and the immune binding proteins made therefrom. For example, nucleic acids encoding antibodies that are amplified from a B-cell using the methods of the invention maintain the natural pairing of heavy and light chains from the B-cell. This maintenance of pairing (or multimerization) produces libraries and/or repertoires of immune binding proteins that are enriched for useful binding molecules.

This application claims priority to provisional application Ser. No. 62/395,241 filed Sep. 15, 2016.

FIELD OF THE INVENTION

The present invention relates generally to the field of immune binding proteins and methods for obtaining immune binding proteins from genomic or other sources. The present invention also relates to nucleic acids encoding the immune binding proteins in which the natural multimeric association of chains is maintained in the nucleic acids and the immune binding proteins made therefrom. For example, nucleic acids encoding antibodies that are amplified from a B-cell using the methods of the invention maintain the natural pairing of heavy and light chains from the B-cell. This maintenance of pairing (or multimerization) produces libraries and/or repertoires of immune binding proteins that are enriched for useful binding molecules.

BACKGROUND OF THE INVENTION

There is considerable interest in being able to discover antibodies to specific antigens. Such antibodies are useful as research tools and for diagnostic and therapeutic applications. However, the identification of such useful antibodies is difficult and, once identified, these antibodies often require considerable redesign before they are suitable for therapeutic applications in humans.

Many methods for identifying antibodies involve display of antibody libraries derived by amplification of nucleic acids from B cells or other tissues. These approaches have limitations that restrict the useful antibodies obtained from the library. For example, most antibody libraries do not pair the heavy and light chains obtained from memory B-cells or plasma cells that have mounted an effective immune response against an immunological challenge. In addition, most known human antibody libraries contain only the antibody sequence diversity that can be experimentally captured or cloned from a biological source (e.g., B cells). Accordingly, such libraries may over-represent some sequences, while completely lacking or under-representing other sequences, especially paired light and heavy chains that form useful antibodies, particularly those from a successful immune response.

It is an object of this invention to provide libraries of immune binding proteins that are enriched for useful immune binding proteins. It is also an object of the invention to provide methods for making such libraries that are enriched for useful multimers of immune binding proteins. It is a further object of the invention to provide methods for amplifying nucleic acids from B-cells and plasma cells so that the pairing of light and heavy chains is maintained. It is an object of the invention to obtain libraries of antibodies relevant to disease therapies by obtaining paired light and heavy chain antibodies from individuals who have mounted antibody responses against a variety of immunologic challenges related to, for example, infectious diseases, cancer, auto-immune disease, neurodegenerative disease, and allergies.

SUMMARY OF THE INVENTION

The invention relates to nucleic acids encoding immune binding proteins that preserve the in vivo multimeric associations of the immune polypeptide chains making up the immune binding protein (e.g., antibodies, T-lymphocyte receptors, or innate immunity receptors). In some embodiments, the invention relates to immune binding protein libraries that are enriched for nucleic acids encoding multimers that functionally represent the multimeric complexes found in the cells from which the immune binding protein library was obtained. In some embodiments, the nucleic acids encoding the polypeptide chains for immune binding proteins are derived from individuals who have mounted an immune response relevant to, for example, an infectious disease, a cancer, an autoimmune disease, an allergy, or a neurodegenerative disease. In some embodiments, the infectious disease is caused by an influenza virus. In some embodiments, the infectious disease is caused by a virus such as, for example, HIV, Ebola, Zika, HSV, RSV, or CMV. In some embodiments, the cancer is a melanoma. In some embodiments, the cancer is one that responds to immunotherapy.

The invention relates to nucleic acids encoding polypeptide chains for immune binding proteins of the invention (e.g., light and heavy chain antibody polypeptides) that preserve the in vivo functional pairing of the polypeptide chains (e.g., light and heavy chains of an antibody). In some embodiments, the immune binding protein libraries of the invention are enriched for functional multimers of nucleic acids encoding the polypeptide chains that make up the immune binding protein (e.g., light and heavy chains of an antibody) and which were associated together in the repertoire from which the immune binding protein library was obtained. In some embodiments, the nucleic acids encoding associated polypeptide chains for the immune binding protein (e.g., paired light and heavy chains) are derived from individuals who have mounted an immune response relevant to, for example, an infectious disease, a cancer, an autoimmune disease, an allergy, or a neurodegenerative disease. In some embodiments, the infectious disease is caused by an influenza virus. In some embodiments, the infectious disease is caused by a virus such as, for example, HIV, Ebola, Zika, HSV, RSV, or CMV.

In some embodiments, the invention relates to a plurality of nucleic acids comprising a plurality of polynucleotides encoding a first chain of a multimeric immune binding protein, a plurality of polynucleotides encoding a second chain of a multimeric immune binding protein, wherein each polynucleotide encoding the first chain of the multimeric immune binding protein is paired with the polynucleotide encoding the second chain of the immune binding protein to form a plurality of pairs of polynucleotides encoding the first chain and the second chain, wherein the plurality of pairs of polynucleotides represent a plurality of pairs of first chains and second chains as they are found in a plurality of host cells from which the multimeric immune binding proteins are derived. In some embodiments, the multimeric immune binding protein is an antibody, a T-cell receptor or an innate immunity receptor. In some embodiments, the antibody is a scFv, a Fab, a F(ab′)₂, a Fab′, a Fv, or a diabody. In some embodiments, the antibody is an IgG, an IgM, an IgA, an IgD, or an IgE. In some embodiments, the antibody is from a B-cell, a plasma cell, a B memory cell, a pre-B-cell or a progenitor B-cell. In some embodiments, the T-cell receptor is a single chain T-cell receptor. In some embodiments, the T-cell receptor is from a CD8+ T-cell, a CD4+ T-cell, a regulatory T-cell, a memory T-cell, a helper T-cell, or a cytotoxic T-cell. In some embodiments, the multimeric immune binding protein is from a natural killer cell, a macrophage, a monocyte, or a dendritic cell.

In some embodiments, individual cells containing nucleic acids encoding the immune binding proteins are placed into microwells and/or an emulsion. In some embodiments, primers for the forward (F) and reverse (R) directions of the nucleic acids encoding the polypeptides for the immune binding protein (e.g., antibody heavy (H) and light (L) chains) are introduced (e.g., HF, HR, LF, and LR), as well as a polymerase enzyme and dNTPs to carry out template-directed amplification. In some embodiments, the F1 (e.g., HF) and R2 (e.g., LR) primers (or alternatively the LF and HR primers) contain an overlap extension region (OE) such that during cycled amplification these primers mutually extend each other. In some embodiments, a joint polypeptide (such as a scFv) can be encoded by the amplified nucleic acids, and the OE region can also encode an amino acid linker sequence (FIG. 1A). In an alternate embodiment, the amplified nucleic acids are used in a sequencing reaction and one or more of the primers can include a barcode region (e.g., BC1, BC2, BC3 and/or BC4) (FIG. 1B). In some embodiments, the amplification reaction is carried out, resulting in a nucleic acid which codes for the two polypeptide chains of the immune binding protein (e.g., both a heavy and a light chain of an antibody). In some embodiments, the nucleic acid obtained from each well and/or emulsion is homogeneous and encodes the antibody made by the single cell placed in the microwell and/or emulsion. In some embodiments, nucleic acids obtained from the wells and/or emulsions are pooled to form a library of immune binding proteins (e.g., heavy/light chain pairs) that reflect the association of polypeptides (e.g., pairing of the antibody chains) from the source cells or genetic material.
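The joining step performed by the overlap extension (OE) region can be illustrated in silico. The following is only a minimal sketch under stated assumptions: it works on sense strands only, the function name `overlap_extend`, the 15-nucleotide OE sequence, and the short chain fragments are all hypothetical, and it does not model primer annealing or polymerase extension.

```python
def overlap_extend(first: str, second: str, min_overlap: int = 12) -> str:
    """Join two amplicons whose ends share an overlap extension (OE) region.

    The 3' end of `first` must be identical to the 5' end of `second`
    over at least `min_overlap` nucleotides (sense strands only).
    """
    max_len = min(len(first), len(second))
    # Search from the longest candidate overlap down to the minimum length.
    for n in range(max_len, min_overlap - 1, -1):
        if first[-n:] == second[:n]:
            # Keep the shared OE region once in the fused product.
            return first + second[n:]
    raise ValueError("no overlap of the required length was found")


# Hypothetical OE region; these 15 nt happen to encode Gly-Gly-Gly-Gly-Ser.
OE = "GGTGGAGGCGGTTCA"
# Hypothetical amplicon fragments: the light-chain product ends with the OE
# (contributed by the LR primer) and the heavy-chain product starts with it
# (contributed by the HF primer), as in the arrangement of FIG. 1A.
light_amplicon = "ATGGACATCCAG" + OE
heavy_amplicon = OE + "GAGGTGCAGCTG"

fused = overlap_extend(light_amplicon, heavy_amplicon)
print(fused)
```

In this sketch the fused product carries the light-chain fragment, a single copy of the OE region, and the heavy-chain fragment, which is the arrangement a linked nucleic acid encoding a scFv would need.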

In some embodiments, the resulting pool of nucleic acids encoding associated polypeptides of the immune binding protein (e.g., paired heavy and light chains for an antibody) is cloned into an expression vector or processed for sequencing. In some embodiments, the expression vector is engineered for phage display, yeast display, or other display technology. In some embodiments, the expression vector is for secretion expression and recombinant production of the immune binding protein. In some embodiments, the expression vector is for making a library of chimeric antigen receptors (CARs), where each CAR has one of the associated immune binding protein clones obtained from the amplification reaction. In some embodiments, primers corresponding to heavy chains or light chains may be targeted to single isotypes of antibodies (e.g., IgG), or pools of primers corresponding to all available isotypes or some fraction thereof may be used.

In some embodiments, primers for the polypeptide chains of the immune binding protein (e.g., light chain and heavy chains of an antibody) are linked together so that each primer is capable of priming a reaction. In some embodiments, a 5′ azide-alkyne reaction (“Click”) coupling can bring together the primers. In this embodiment, the dual primer is incubated with single cells in a well or emulsion, and nucleic acids are obtained where a nucleic acid encoding one polypeptide chain of the immune binding protein is linked to a nucleic acid encoding the associated polypeptide chain of the same immune binding protein (e.g., a heavy chain is linked to a nucleic acid encoding the paired light chain). In some embodiments, a microsurface (e.g., bead or microwell) is prepared and contains primer sequences that are capable of binding nucleic acids encoding multiple, associated polypeptides of the immune binding protein (e.g., heavy and light chain nucleic acids). Following mRNA capture, cDNA synthesis or PCR from a single cell in a spatial confinement with the primers in the well or on the bead, nucleic acids encoding the associated polypeptide chains (e.g., paired heavy and light chains) become co-located with the primers of the solid phase.

In some embodiments, nucleic acid probes for nucleic acids encoding associated polypeptides of the immune binding protein (e.g., heavy and light chain polypeptides) are placed on a solid surface. In this embodiment, the probes for nucleic acids encoding associated polypeptides of the immune binding protein (e.g., heavy and light chain polypeptides) are interrogated with nucleic acids, e.g., mRNA, from a single cell. The probes on the solid phase will capture nucleic acids encoding the associated polypeptides of the immune binding protein (e.g., heavy and light chain polypeptides) from the cell. In some embodiments, captured mRNA is reverse transcribed to make paired cDNAs encoding associated polypeptides of the immune binding protein (e.g., heavy and light chain polypeptides) from a single cell.

In some embodiments, the nucleic acids encoding the subunits of the immune binding protein are bar coded to enable identification of unique molecules. In some embodiments, a solid phase with a cell-specific barcode is made with spatially confined PCR reactions of a plurality of single template molecules containing a linker/adapter primer sequence, a random barcode sequence, and a secondary primer sequence. In some embodiments, a limited dilution of template molecules is used, and the template molecule is linked to a solid phase at very low loading rates to ensure only a single molecule is available as a template at each site. In this embodiment, at least one of the primers in this PCR reaction should be attached to the solid phase. In some embodiments, additional molecules may be added to load additional sites, knowing that previously bound sites are incapable of reacting because they were exhausted during previous rounds of PCR. In some embodiments, oligonucleotides can be attached at an extremely low loading rate to a surface and beads are flowed over the surface to ensure that each bead binds a single oligonucleotide. In some embodiments, beads are reflowed over the surface without being subjected to the constraints of Poissonian loading. In some embodiments, each bound bead would be guaranteed to have one and only one template sequence. In some embodiments, each spatially confined site (either a position or well on a patterned surface, or bead in emulsion) will contain the same barcoded DNA in close proximity, whereas other sites will each contain separate barcoded DNA in close proximity originating from other single molecule templates. In some embodiments, single stranded DNA can be generated through the use of a 5′ nuclease or denaturation of the uncoupled second strand. In this embodiment, the secondary primer sequence is available to perform a subsequent barcode extension reaction, or can be used directly to capture nucleic acids from single cells. In some embodiments, the bead can be ligated to a sequence containing a linker section and a fully random sequence to serve as a unique molecular identifier, and a tertiary primer sequence. In this embodiment, the tertiary primer sequence is available to perform a subsequent barcode extension reaction, or can be used directly to capture nucleic acids from single cells.
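The reason very low loading rates favor a single template per site can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that the number of template molecules per site follows a Poisson distribution with mean `lam`. The function names and the example loading values are hypothetical and are not taken from this disclosure.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a site receives exactly k template molecules."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def single_template_fraction(lam: float) -> float:
    """Among occupied sites (k >= 1), the fraction holding exactly one template."""
    p_empty = poisson_pmf(0, lam)
    p_single = poisson_pmf(1, lam)
    return p_single / (1.0 - p_empty)


# Compare several hypothetical mean loading rates (templates per site).
for lam in (1.0, 0.1, 0.01):
    occupied = 1.0 - poisson_pmf(0, lam)
    print(f"mean={lam}: occupied={occupied:.4f}, "
          f"single among occupied={single_template_fraction(lam):.4f}")
```

Under this assumption, lowering the mean loading from 1.0 to 0.1 templates per site raises the share of occupied sites that carry exactly one template from roughly 58% to roughly 95%, at the cost of leaving most sites empty. That trade-off is consistent with the embodiments above in which additional molecules are later added to load the remaining sites.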

In some embodiments, the antigens are identified for the immune binding proteins of the invention. In some embodiments, the nucleic acids of the invention encode the subunits (or pairs) of the immune binding protein and the antigen bound by the immune binding protein. In some embodiments, a three-way coupling is made between nucleic acids encoding associated polypeptides of the immune binding protein (e.g., heavy and light chain polypeptides) and an antigen that is barcoded with an antigen-specific sequence. In some embodiments, antibodies are displayed on the surface of a cell and probed with a population of barcoded antigens; the resulting conjugates can then be encapsulated into a microwell or an emulsion, and sequence amplification methods are utilized to recover the sequence of the associated polypeptides of the immune binding protein (e.g., heavy and light chain polypeptides) and the barcoded antigen sequence. In some embodiments, a plurality of antigens are bar coded. The bar coded antigens can be screened against immune binding proteins to find the immune binding proteins that bind to specific antigens. This screening can be done with immune binding proteins from the libraries described herein, immune cells obtained from a subject who is naïve to the antigen, or immune cells obtained from a subject who has mounted a relevant immune response (e.g., an immune response relevant to an infectious disease, a cancer, an autoimmune disease, an allergy, or a neurodegenerative disease). The immune cells paired with bar coded antigens can then be used in the amplification methods to obtain nucleic acids encoding immune binding proteins and the immune binding proteins.
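Once heavy chain, light chain, and antigen barcode sequences have been recovered together, assigning antigens to chain pairs reduces to a lookup and a tally. The sketch below is illustrative only: the barcode table, the record layout, and the function name `assign_antigens` are hypothetical, and a real analysis would also need to handle sequencing errors in the barcodes.

```python
from collections import Counter, defaultdict

# Hypothetical table mapping antigen-specific barcode sequences to antigens.
ANTIGEN_BARCODES = {
    "ACGTAC": "antigen_A",
    "TGCATG": "antigen_B",
}


def assign_antigens(records):
    """Tally antigen barcodes recovered with each heavy/light chain pair.

    `records` is an iterable of (heavy_id, light_id, antigen_barcode) tuples,
    one per recovered three-way coupled sequence.
    Returns {(heavy_id, light_id): Counter({antigen_name: read_count})}.
    """
    hits = defaultdict(Counter)
    for heavy_id, light_id, barcode in records:
        antigen = ANTIGEN_BARCODES.get(barcode)
        if antigen is None:
            continue  # barcode not in the table: leave the read unassigned
        hits[(heavy_id, light_id)][antigen] += 1
    return dict(hits)


# Hypothetical recovered records.
records = [
    ("H1", "L1", "ACGTAC"),
    ("H1", "L1", "ACGTAC"),
    ("H2", "L2", "TGCATG"),
    ("H2", "L2", "NNNNNN"),  # unreadable barcode, ignored
]
result = assign_antigens(records)
print(result)
```

Keeping the tally keyed on the heavy/light pair, rather than on either chain alone, mirrors the point of the three-way coupling: the antigen is associated with the paired chains as they occurred together.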

In some embodiments, the nucleic acids encoding the immune binding proteins are sequenced. In some embodiments, the sequencing is done by high-throughput sequencing. In some embodiments, the sequence information obtained is used for putative lineage information based on sequence alignment.

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A-C show depictions of primers used in the invention and nucleic acid products made by the invention. FIG. 1A shows a reaction using light chain forward (LF) and light chain reverse (LR) primers where the LR primer includes an overlap extension region (OE). FIG. 1A also shows heavy chain forward (HF) and heavy chain reverse (HR) primers where the HF primer also includes an overlap extension region (OE). FIG. 1B shows an embodiment where one or more of the primers include a bar code region for identification of the nucleic acid made by the primers. FIG. 1C shows an embodiment where an antigen includes a nucleic acid that identifies the antigen, and which nucleic acid also has forward and reverse primers (antigen forward AF and antigen reverse AR) where the AR primer has an overlap region that will correspond to an overlap region of one of the heavy chain or light chain primers, e.g., the LF primer can include an OE that will correspond to the OE of the AR primer.

DETAILED DESCRIPTION OF THE INVENTION

Before the various embodiments are described, it is to be understood that the teachings of this disclosure are not limited to the particular embodiments described, and as such can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present teachings will be limited only by the appended claims.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present teachings, some exemplary methods and materials are now described.

It must be noted that as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise. It is further noted that the claims can be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements, or use of a “negative” limitation. Numerical limitations given with respect to concentrations or levels of a substance are intended to be approximate, unless the context clearly dictates otherwise. Thus, where a concentration is indicated to be (for example) 10 μg, it is intended that the concentration be understood to be at least approximately or about 10 μg.

As will be apparent to those of skill in the art upon reading this disclosure, each of the individual embodiments described and illustrated herein has discrete components and features which can be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present teachings. Any recited method can be carried out in the order of events recited or in any other order which is logically possible.

Definitions

As used herein, an “antibody” refers to a protein functionally defined as a binding protein and structurally defined as comprising an amino acid sequence that is recognized as being derived from the framework region of an immunoglobulin encoding gene of an animal producing antibodies. An antibody can consist of one or more polypeptides substantially encoded by immunoglobulin genes or fragments of immunoglobulin genes. The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon and mu constant region genes, as well as myriad immunoglobulin variable region genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, IgM, IgA, IgD and IgE, respectively.

A typical immunoglobulin (antibody) structural unit is known to comprise a tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one “light” (about 25 kD) and one “heavy” chain (about 50-70 kD). The N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition. The terms variable light chain (V_(L)) and variable heavy chain (V_(H)) refer to the variable regions of these light and heavy chains, respectively.

Antibodies exist as intact immunoglobulins or as a number of well-characterized fragments. Thus, for example, pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab′)₂, a dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab′)₂ may be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab′)₂ dimer into an Fab′ monomer. The Fab′ monomer is essentially an Fab with part of the hinge region (see, Fundamental Immunology, W. E. Paul, ed., Raven Press, N.Y. (1993), for a more detailed description of other antibody fragments). While various antibody fragments are defined in terms of the digestion of an intact antibody, one of skill will appreciate that fragments can be synthesized de novo either chemically or by utilizing recombinant DNA methodology. Thus, the term antibody, as used herein, also includes antibody fragments either produced by the modification of whole antibodies or synthesized using recombinant DNA methodologies. Preferred antibodies include V_(H)-V_(L) dimers, including single chain antibodies (antibodies that exist as a single polypeptide chain), such as single chain Fv antibodies (sFv or scFv) in which a variable heavy and a variable light region are joined together (directly or through a peptide linker) to form a continuous polypeptide. The single chain Fv antibody is a covalently linked V_(H)-V_(L) heterodimer which may be expressed from a nucleic acid including V_(H)- and V_(L)-encoding sequences either joined directly or joined by a peptide-encoding linker (e.g., Huston, et al. Proc. Nat. Acad. Sci. USA, 85:5879-5883, 1988). While the V_(H) and V_(L) are connected to each other as a single polypeptide chain, the V_(H) and V_(L) domains associate non-covalently. Alternatively, the antibody can be another fragment. Other fragments can also be generated, including using recombinant techniques. For example, Fab molecules can be displayed on phage if one of the chains (heavy or light) is fused to g3 capsid protein and the complementary chain exported to the periplasm as a soluble molecule. The two chains can be encoded on the same or on different replicons; the two antibody chains in each Fab molecule assemble post-translationally and the dimer is incorporated into the phage particle via linkage of one of the chains to g3p (see, e.g., U.S. Pat. No. 5,733,743). The scFv antibodies and a number of other structures converting the naturally aggregated, but chemically separated light and heavy polypeptide chains from an antibody V region into a molecule that folds into a three dimensional structure substantially similar to the structure of an antigen-binding site are known to those of skill in the art (see e.g., U.S. Pat. Nos. 5,091,513, 5,132,405, and 4,956,778). In some embodiments, the scFv is a diabody as described in Holliger et al., Proc. Nat'l Acad. Sci. vol. 90, pp. 6444-6448 (1993), which is incorporated by reference in its entirety for all purposes. In some embodiments, antibodies include all those that have been displayed on phage or generated by recombinant technology using vectors where the chains are secreted as soluble proteins, e.g., scFv, Fv, Fab, or F(ab′)₂. Antibodies can also include diantibodies and miniantibodies.

Antibodies of the invention also include heavy chain dimers, such as antibodies from camelids. Since the V_(H) region of a heavy chain dimer IgG in a camelid does not have to make hydrophobic interactions with a light chain, the region in the heavy chain that normally contacts a light chain is changed to hydrophilic amino acid residues in a camelid. V_(H) domains of heavy-chain dimer IgGs are called V_(HH) domains.

In camelids, the diversity of the antibody repertoire is determined by the complementarity determining regions (CDR) 1, 2, and 3 in the V_(H) or V_(HH) regions. The CDR3 in the camel V_(HH) region is characterized by its relatively long length, averaging 16 amino acids (Muyldermans et al., 1994, Protein Engineering 7(9): 1129). This is in contrast to CDR3 regions of antibodies of many other species. For example, the CDR3 of mouse V_(H) has an average of 9 amino acids.

Libraries of camelid-derived antibody variable regions, which maintain the in vivo diversity of the variable regions of a camelid, can be made by, for example, the methods disclosed in U.S. Patent Application Ser. No. 20050037421, published Feb. 17, 2005.

As used herein, the term “naturally occurring” means that the components are encoded by a single gene that was not altered by recombinant means and that pre-exists in an organism, e.g., in an antibody library that was created from naive cells or cells that were exposed to an antigen.

As used herein, the term “antigen” refers to substances that are capable, under appropriate conditions, of inducing a specific immune response and of reacting with the products of that response, such as specific antibodies or specifically sensitized T-lymphocytes, or both. Antigens may be soluble substances, such as toxins and foreign proteins, or particulates, such as bacteria and tissue cells; however, only the portion of the protein or polysaccharide molecule known as the antigenic determinant (epitope) combines with the antibody or a specific receptor on a lymphocyte. More broadly, the term “antigen” may be used to refer to any substance to which an antibody binds, or for which antibodies are desired, regardless of whether the substance is immunogenic. For such antigens, antibodies may be identified by recombinant methods, independently of any immune response.

As used herein, the term “epitope” refers to the site on an antigen or hapten to which specific B cells and/or T cells respond. The term is also used interchangeably with “antigenic determinant” or “antigenic determinant site”. Epitopes include that portion of an antigen or other macromolecule capable of forming a binding interaction with the variable region binding pocket of an antibody.

As used herein, the term “binding specificity” of an antibody refers to the identity of the antigen to which the antibody binds, preferably to the identity of the epitope to which the antibody binds.

As used herein, the term “chimeric polynucleotide” means that the polynucleotide comprises regions which are wild-type and regions which are mutated. It may also mean that the polynucleotide comprises wild-type regions from one polynucleotide and wild-type regions from another related polynucleotide.

As used herein, the term “complementarity-determining region” or “CDR” refers to the art-recognized term as exemplified by the Kabat and Chothia definitions. CDRs are also generally known as hypervariable regions or hypervariable loops (Chothia and Lesk (1987) J Mol. Biol. 196: 901; Chothia et al. (1989) Nature 342: 877; E. A. Kabat et al., Sequences of Proteins of Immunological Interest (National Institutes of Health, Bethesda, Md.) (1987); and Tramontano et al. (1990) J Mol. Biol. 215: 175). “Framework region” or “FR” refers to the region of the V domain that flanks the CDRs. The positions of the CDRs and framework regions can be determined using various well known definitions in the art, e.g., Kabat, Chothia, international ImMunoGeneTics database (IMGT), and AbM (see, e.g., Johnson et al., supra; Chothia & Lesk, 1987, Canonical structures for the hypervariable regions of immunoglobulins. J. Mol. Biol. 196, 901-917; Chothia C. et al., 1989, Conformations of immunoglobulin hypervariable regions. Nature 342, 877-883; Chothia C. et al., 1992, structural repertoire of the human VH segments J. Mol. Biol. 227, 799-817; Al-Lazikani et al., J. Mol. Biol 1997, 273(4)). Definitions of antigen combining sites are also described in the following: Ruiz et al., IMGT, the international ImMunoGeneTics database. Nucleic Acids Res., 28, 219-221 (2000); and Lefranc, M.-P. IMGT, the international ImMunoGeneTics database. Nucleic Acids Res. January 1; 29(1):207-9 (2001); MacCallum et al, Antibody-antigen interactions: Contact analysis and binding site topography, J. Mol. Biol., 262 (5), 732-745 (1996); and Martin et al, Proc. Natl Acad. Sci. USA, 86, 9268-9272 (1989); Martin, et al, Methods Enzymol., 203, 121-153, (1991); Pedersen et al, Immunomethods, 1, 126, (1992); and Rees et al, In Sternberg M. J. E. (ed.), Protein Structure Prediction. Oxford University Press, Oxford, 141-172 (1996).

As used herein, the term “hapten” is a small molecule that, when attached to a larger carrier such as a protein, can elicit an immune response in an organism, e.g., the production of antibodies that bind specifically to it (in either the free or combined state). A “hapten” is able to bind to a preformed antibody, but may fail to stimulate antibody generation on its own. In the context of this invention, the term “hapten” includes modified amino acids, either naturally occurring or non-naturally occurring. Thus, for example, the term “hapten” includes naturally occurring modified amino acids such as phosphotyrosine, phosphothreonine, phosphoserine, or sulphated residues such as sulphated tyrosine (sulphotyrosine), sulphated serine (sulphoserine), or sulphated threonine (sulphothreonine); and also includes non-naturally occurring modified amino acids such as p-nitro-phenylalanine.

As used herein, the term “heterologous” when used with reference to portions of a polynucleotide indicates that the nucleic acid comprises two or more subsequences that are not normally found in the same relationship to each other in nature. For instance, the nucleic acid is typically recombinantly produced, having two or more sequences, e.g., from unrelated genes arranged to make a new functional nucleic acid. Similarly, a “heterologous” polypeptide or protein refers to two or more subsequences that are not found in the same relationship to each other in nature.

As used herein, the term “host cell” refers to a prokaryotic or eukaryotic cell into which the vectors of the invention may be introduced, expressed and/or propagated. A microbial host cell is a cell of a prokaryotic or eukaryotic micro-organism, including bacteria, yeasts, microscopic fungi and microscopic phases in the life-cycle of fungi and slime molds. Typical prokaryotic host cells include various strains of E. coli. Typical eukaryotic host cells are yeast or filamentous fungi, or mammalian cells, such as Chinese hamster ovary cells, murine NIH 3T3 fibroblasts, human embryonic kidney 293 cells, or rodent myeloma or hybridoma cells.

As used herein, the term “immunological response” to a composition or vaccine is the development in the host of a cellular and/or antibody-mediated immune response to a composition or vaccine of interest. Usually, an “immunological response” includes but is not limited to one or more of the following effects: the production of antibodies, B cells, helper T cells, and/or cytotoxic T cells, directed specifically to an antigen or antigens included in the composition or vaccine of interest. Preferably, the host will display either a therapeutic or protective immunological response such that resistance to new infection will be enhanced and/or the clinical severity of the disease reduced. Such protection will be demonstrated by either a reduction or lack of symptoms normally displayed by an infected host, a quicker recovery time and/or a lowered viral titer in the infected host.

As used herein, the term “isolated” refers to a nucleic acid or polypeptide separated not only from other nucleic acids or polypeptides that are present in the natural source of the nucleic acid or polypeptide, but also from polypeptides, and preferably refers to a nucleic acid or polypeptide found in the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a solution of the same. The terms “isolated” and “purified” do not encompass nucleic acids or polypeptides present in their natural source.

As used herein, the term “mammal” refers to warm-blooded vertebrate animals all of which possess hair and suckle their young.

As used herein, “percentage of sequence identity” and “percentage homology” are used interchangeably to refer to comparisons among polynucleotides or polypeptides, and are determined by comparing two optimally aligned sequences over a comparison window, where the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence for optimal alignment of the two sequences. The percentage may be calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Alternatively, the percentage may be calculated by determining the number of positions at which either the identical nucleic acid base or amino acid residue occurs in both sequences or a nucleic acid base or amino acid residue is aligned with a gap to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Those of skill in the art appreciate that there are many established algorithms available to align two sequences. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman, Adv Appl Math. 2:482, 1981; by the homology alignment algorithm of Needleman and Wunsch, J Mol Biol. 48:443, 1970; by the search for similarity method of Pearson and Lipman, Proc Natl Acad Sci. USA 85:2444, 1988; by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the GCG Wisconsin Software Package), or by visual inspection (see generally, Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1995 Supplement)). Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., J. Mol. Biol. 215:403-410, 1990; and Altschul et al., Nucleic Acids Res. 25(17):3389-3402, 1997; respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information website. BLAST for nucleotide sequences can use the BLASTN program with default parameters, e.g., a wordlength (W) of 11, an expectation (E) of 10, M=5, N=−4, and a comparison of both strands. BLAST for amino acid sequences can use the BLASTP program with default parameters, e.g., a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, Proc Natl Acad Sci. USA 89:10915, 1992). Exemplary determination of sequence alignment and % sequence identity can also employ the BESTFIT or GAP programs in the GCG Wisconsin Software package (Accelrys, Madison Wis.), using default parameters provided.
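The two ways of calculating the percentage described in this definition can be illustrated with a minimal Python sketch. It assumes the two sequences have already been optimally aligned by one of the cited algorithms and that a gap is written as “-”; the function name and the example sequences are hypothetical and are not part of the invention.

```python
def percent_identity(aligned_a: str, aligned_b: str,
                     count_gap_columns: bool = False) -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    A '-' character marks a gap (an assumption of this sketch). With
    count_gap_columns=False only identical residues count as matched
    positions; with True, positions where a residue is aligned with a
    gap are also counted, as in the alternative calculation.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty comparison window")

    matched = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            if count_gap_columns:
                matched += 1
        elif a == b:
            matched += 1
    # Divide by the total number of positions in the window, times 100.
    return 100.0 * matched / len(aligned_a)


# Toy alignment: 6 positions, 4 identical residues, 1 mismatch, 1 gap.
identity_only = percent_identity("ACGT-A", "ACCTTA")
with_gaps = percent_identity("ACGT-A", "ACCTTA", count_gap_columns=True)
```

For this toy alignment the first calculation gives 4 matched positions out of 6 and the alternative calculation gives 5 out of 6.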

As used herein, the terms “protein”, “peptide”, “polypeptide” and “polypeptide fragment” are used interchangeably herein to refer to polymers of amino acid residues of any length. The polymer can be linear or branched, it may comprise modified amino acids or amino acid analogs, and it may be interrupted by chemical moieties other than amino acids. The terms also encompass an amino acid polymer that has been modified naturally or by intervention; for example disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation or modification, such as conjugation with a labeling or bioactive component.

As used herein, the term “purified” means that the indicated nucleic acid or polypeptide is present in the substantial absence of other biological macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more preferably at least 99.8% by weight, of the indicated biological macromolecules present (but water, buffers, and other small molecules, especially molecules having a molecular weight of less than 1000 daltons, can be present).

As used herein, the term “recombinant nucleic acid” refers to a nucleic acid in a form not normally found in nature. That is, a recombinant nucleic acid is flanked by a nucleotide sequence not naturally flanking the nucleic acid or has a sequence not normally found in nature. Recombinant nucleic acids can be originally formed in vitro by the manipulation of nucleic acid by restriction endonucleases, or alternatively using such techniques as polymerase chain reaction. It is understood that once a recombinant nucleic acid is made and reintroduced into a host cell or organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the host cell rather than in vitro manipulations; however, such nucleic acids, once produced recombinantly, although subsequently replicated non-recombinantly, are still considered recombinant for the purposes of the invention.

As used herein, the term “recombinant polypeptide” refers to a polypeptide expressed from a recombinant nucleic acid, or a polypeptide that is chemically synthesized in vitro.

As used herein, the term “recombinant variant” refers to any polypeptide differing from naturally occurring polypeptides by amino acid insertions, deletions, and substitutions, created using recombinant DNA techniques. Guidance in determining which amino acid residues may be replaced, added, or deleted without abolishing activities of interest, such as enzymatic or binding activities, may be found by comparing the sequence of the particular polypeptide with that of homologous peptides and minimizing the number of amino acid sequence changes made in regions of high homology.

Preferably, amino acid “substitutions” are the result of replacing one amino acid with another amino acid having similar structural and/or chemical properties, i.e., conservative amino acid replacements. Amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino acids include aspartic acid and glutamic acid.
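As one way to apply the four groups listed above, the following Python sketch treats a substitution as conservative when both residues fall in the same group. The same-group rule is a simplifying assumption of this example, as are the dictionary and function names; the text itself allows similarity to be judged on several properties.

```python
# One-letter codes for the four groups named in the text.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Return the name of the group that contains a one-letter residue code."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same group (sketch assumption)."""
    return residue_class(original) == residue_class(replacement)


leu_to_ile = is_conservative("L", "I")   # both nonpolar
lys_to_glu = is_conservative("K", "E")   # basic versus acidic
```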

As used herein, the terms “repertoire” or “library” refer to a library of genes encoding antibodies or antibody fragments such as Fab, scFv, Fd, LC, V_(H), or V_(L), or a subfragment of a variable region, e.g., an exchange cassette, that is obtained from a natural ensemble, or “repertoire”, of antibody genes present, e.g., in human donors, and obtained primarily from the cells of peripheral blood and spleen. In some embodiments, the human donors are “non-immune”, i.e., not presenting with symptoms of infection. In the current invention, a library or repertoire often comprises members that are exchange cassettes of a given portion of a V region.

As used herein, the term “synthetic antibody library” refers to a library of genes encoding one or more antibodies or antibody fragments such as Fab, scFv, Fd, LC, V_(H), or V_(L), or a subfragment of a variable region, e.g., an exchange cassette, in which one or more of the complementarity-determining regions (CDR) has been partially or fully altered, e.g., by oligonucleotide-directed mutagenesis. “Randomized” means that part or all of the sequence encoding the CDR has been replaced by sequence randomly encoding all twenty amino acids or some subset of the amino acids.

As used herein, a “T-cell” is defined to be a hematopoietic cell that normally develops in the thymus. T-cells include, but are not limited to, natural killer T cells, regulatory T cells, helper T cells, cytotoxic T cells, memory T cells, gamma delta T cells and mucosal invariant T cells. T-cells also include, but are not limited to, CD8+ T-cells, CD4+ T-cells, Th1 T-cells, and Th2 T-cells.

The singular terms “a”, “an”, and “the” include plural referents unless context clearly indicates otherwise. Similarly, the word “or” is intended to include “and” unless the context clearly indicates otherwise. Numerical limitations given with respect to concentrations or levels of a substance, such as an antigen, are intended to be approximate. Thus, where a concentration is indicated to be at least (for example) 200 μg, it is intended that the concentration be understood to be at least approximately (or “about”) 200 μg.

Immune Binding Proteins

In some embodiments, the immune binding protein is an antibody, a T-cell receptor, or an innate immunity receptor. In some embodiments, the immune binding protein is from a cell of the immune system including, for example, a B-cell, a plasma cell, a T-cell, a natural killer cell, a dendritic cell, or a macrophage.

In some embodiments, antibodies are immune binding proteins that are structurally defined as comprising an amino acid sequence recognized as being derived from the framework region of an immunoglobulin. In some embodiments, an antibody consists of one or more polypeptides substantially encoded by immunoglobulin genes or fragments of immunoglobulin genes. In some embodiments, the immunoglobulin genes include, for example, the kappa, lambda, alpha, gamma, delta, epsilon and mu constant region genes, as well as myriad immunoglobulin variable region genes. In some embodiments, antibody light chains are classified as either kappa or lambda. In some embodiments, antibody heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, IgM, IgA, IgD and IgE, respectively.

In some embodiments, antibodies exist as intact immunoglobulins or as a number of well-known fragments. In some embodiments, pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab′)₂, a dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond. In some embodiments, the F(ab′)₂ may be reduced under mild conditions to break the disulfide linkage in the hinge region thereby converting the F(ab′)₂ dimer into Fab′ monomers. In some embodiments, the Fab′ monomer is an Fab with part of the hinge region (see, Fundamental Immunology, W. E. Paul, ed., Raven Press, N.Y. (1993), which is incorporated by reference in its entirety for all purposes). In some embodiments, antibody fragments are synthesized de novo either chemically or by utilizing recombinant DNA methodology. In some embodiments, antibodies include V_(H)-V_(L) dimers, including single chain antibodies (antibodies that exist as a single polypeptide chain), such as diabodies, or single chain Fv antibodies (sFv or scFv) in which a variable heavy and a variable light region are joined together (directly or through a peptide linker) to form a continuous polypeptide (e.g., Huston, et al. Proc. Nat. Acad. Sci. USA, 85:5879-5883, 1988, which is incorporated by reference in its entirety for all purposes). In some embodiments, antibodies can be another fragment, including, for example, Fab molecules displayed on phage if one of the chains (heavy or light) is fused to g3 capsid protein and the complementary chain exported to the periplasm as a soluble molecule (e.g., U.S. Pat. No. 5,733,743, which is incorporated by reference in its entirety for all purposes).
In some embodiments, the antibody is an scFv antibody or one of a number of other structures, known to those of skill in the art, that convert the naturally aggregated, but chemically separated, light and heavy polypeptide chains from an antibody V region into a molecule that folds into a three dimensional structure substantially similar to the structure of an antigen-binding site (e.g., U.S. Pat. Nos. 5,091,513, 5,132,405, and 4,956,778, which are all incorporated by reference in their entirety for all purposes). In some embodiments, the scFv is a diabody as described in Holliger et al., Proc. Nat'l Acad. Sci. vol. 90, pp. 6444-6448 (1993), which is incorporated by reference in its entirety for all purposes. In some embodiments, antibodies include all those that have been displayed on phage or generated by recombinant technology using vectors where the chains are secreted as soluble proteins, e.g., scFv, Fv, Fab, or F(ab′)₂. Antibodies can also include miniantibodies. In some embodiments, the antibody is from a B-cell, a plasma cell, a B memory cell, a pre-B-cell or a progenitor B-cell.

In some embodiments, the immune binding protein is a T-cell receptor. In some embodiments, the T-cell receptor is from a CD8+ T-cell, a CD4+ T-cell, a regulatory T-cell, a memory T-cell, a helper T-cell, or a cytotoxic T-cell. In some embodiments, T-cell receptors are obtained from either (or both) the genomic DNA of the T-cells (or subpopulation of T-cells) and/or the mRNA of the T-cells (or subpopulation of T-cells). In some embodiments, repertoires of T-cell receptors are obtained using techniques and primers well known in the art and described in, for example, SMARTer Human TCR a/b Profiling Kits sold commercially by Clontech, Boria et al., BMC Immunol. 9:50-58 (2008); Moonka et al., J. Immunol. Methods 169:41-51 (1994); Kim et al., PLoS ONE 7:e37338 (2012); Seitz et al., Proc. Natl Acad. Sci. 103:12057-62 (2006), all of which are incorporated by reference in their entirety for all purposes. In some embodiments, the T-cell receptors are used as separate chains to form an immune binding protein. In some embodiments, the T-cell receptors are converted to single chain antigen binding domains. In some embodiments, single chain T-cell receptors are made from nucleic acids encoding human alpha and beta chains using techniques well-known in the art including, for example, those described in U.S. Patent Application Publication No. US2012/0252742, Schodin et al., Mol. Immunol. 33:819-829 (1996); Aggen et al., “Engineering Human Single-Chain T Cell Receptors,” Ph.D. Thesis with the University of Illinois at Urbana-Champaign (2010) a copy of which is found at ideals.illinois.edu/bistream/handle/2142/18585/Aggen_David.pdf?sequence=1, all of which are incorporated by reference in their entirety for all purposes.

In some embodiments, the immune binding protein is an innate immunity receptor. In some embodiments, natural killer cells, dendritic cells, macrophages, T-cells, and/or B-cells are used to make NKG receptor binding proteins and/or Toll-like receptor binding proteins. In some embodiments, the natural killer cells, dendritic cells, macrophages, T-cells, and/or B-cells are obtained from a subject who has become immune to a disease or has had an immune response to a disease or condition. In some embodiments, the immune binding protein is obtained from the CD94/NKG2 receptor family (e.g., NKG2A, NKG2B, NKG2C, NKG2D, NKG2E, NKG2F, NKG2H), the 2B4 receptor, the NKp30, NKp44, NKp46, and NKp80 receptors, or the Toll-like receptors (e.g., TLR1, TLR2, TLR3, TLR4, TLR5, TLR6, TLR7, TLR8, TLR9, TLR10, RP105), and/or the innate immunity receptors are obtained from the subject's immune cells (natural killer cells, dendritic cells, macrophages, T-cells, and B-cells). In some embodiments, the immune binding proteins of the invention are made as described in U.S. Pat. Nos. 5,359,046, 5,686,281 and 6,103,521 (which are hereby incorporated by reference in their entirety for all purposes). In some embodiments, the immune binding protein is part of a receptor which is monomeric, homodimeric, heterodimeric, or associated with a larger number of proteins in a non-covalent complex. In some embodiments, a multimeric receptor has only one polypeptide chain with a major role in binding to the ligand. In these embodiments, the immune binding protein can be derived from the polypeptide chain that binds the ligand. In some embodiments, the immune binding protein is a complex of extracellular portions from several proteins that forms covalent bonds through disulfide linkages. In some embodiments, the immune binding protein is comprised of truncated portions of a receptor, where such truncated portion is functional for binding ligand.

Methods for Amplifying Nucleic Acids Encoding Multimeric Immune Proteins

The invention relates to methods for making nucleic acids encoding immune binding proteins that preserve the in vivo multimeric associations of the immune polypeptide chains making up the immune binding protein (e.g., antibodies, T-lymphocyte receptors or innate immunity receptors). In some embodiments, immune binding protein libraries of the invention are enriched for nucleic acids encoding multimers that are functional polypeptides representing the multimeric complexes found in the repertoire from which the immune binding protein library was obtained. In some embodiments, the nucleic acids encoding the polypeptide chains for immune binding proteins are derived from individuals who have mounted an immune response relevant to, for example, an infectious disease, a cancer, an autoimmune disease, an allergy, or a neurodegenerative disease. In some embodiments, the infectious disease is caused by an influenza virus. In some embodiments, the infectious disease is caused by a virus such as, for example, HIV, Ebola, Zika, HSV, RSV, or CMV.

In some embodiments, the immune binding proteins are antibodies or are immune binding proteins derived from antibodies. In some embodiments, the immune binding proteins are T-cell receptors from, for example, cytotoxic T-cells, helper T-cells, and memory T-cells. In some embodiments, the immune binding proteins are innate immune receptors such as, for example, the CD94/NKG2 receptor family (e.g., NKG2A, NKG2B, NKG2C, NKG2D, NKG2E, NKG2F, NKG2H), the 2B4 receptor, the NKp30, NKp44, NKp46, and NKp80 receptors, and the Toll-like receptors (e.g., TLR1, TLR2, TLR3, TLR4, TLR5, TLR6, TLR7, TLR8, TLR9, TLR10, RP105).

In some embodiments, immune binding proteins are made from individual cells that are placed into microwells and/or an emulsion. In some embodiments, forward (F) and reverse (R) primers are used for each individual chain of the immune binding protein (e.g., heavy (H) and light (L) chain primers designated HF, HR, LF, and LR), as well as a polymerase enzyme and dNTPs to carry out template-directed amplification. In some embodiments, the primers for an individual chain of the immune binding protein (e.g., the HF and HL primers for an antibody heavy chain and/or alternatively the LF and HR primers for the antibody light chain) contain an overlap extension region (OE) such that during cycled amplification the primers for one chain extend (amplify) nucleic acids encoding the other chains of the immune binding protein. In some embodiments, a joint polypeptide (such as a scFv or a single chain T-cell receptor) can be encoded by the amplified nucleic acids, and the OE region can optionally encode an amino acid linker sequence.
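The joining of two chains through a shared overlap extension region can be pictured with a small string-level Python sketch. It is only an illustration under stated assumptions: both amplicons are written on the same strand, the heavy-chain product is assumed to end with the OE region and the light-chain product to begin with it, and all sequences (including an OE made of Gly-Gly-Gly-Gly-Ser codons) are toy values chosen for this example rather than primers of the invention.

```python
def fuse_by_overlap(left: str, right: str, min_overlap: int = 8) -> str:
    """Join two amplicons through the longest suffix of `left` that is
    also a prefix of `right` (the shared overlap extension region)."""
    max_len = min(len(left), len(right))
    for k in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            # Keep the overlap once and append the rest of the second amplicon.
            return left + right[k:]
    raise ValueError("no shared overlap extension region found")


# Hypothetical toy sequences (same-strand representation).
VH = "ATGGCCCAGGTG"      # stand-in for a heavy chain variable amplicon
VL = "GACATCCAGATG"      # stand-in for a light chain variable amplicon
OE = "GGTGGAGGCGGTTCA"   # assumed OE region; codons for Gly-Gly-Gly-Gly-Ser

heavy_product = VH + OE  # heavy chain amplicon carrying the OE tail
light_product = OE + VL  # light chain amplicon carrying the OE tail
linked = fuse_by_overlap(heavy_product, light_product)
```

In this sketch the fused product carries the heavy and light sequences from one reaction compartment on a single molecule, with the OE region between them acting as an optional linker.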

In some embodiments, the amplification reaction is carried out, resulting in a nucleic acid which codes for each of the polypeptides from the immune binding protein (e.g., both a heavy and a light chain of an antibody). In some embodiments, the nucleic acid obtained from each well and/or emulsion is homogeneous and encodes the immune binding protein (e.g., antibody) made by the single cell placed in the microwell and/or emulsion. In some embodiments, nucleic acids obtained from the wells and/or emulsions are pooled to form a library of heavy/light chain pairs that reflect the pairing of the antibody chains from the source cells or genetic material.

In some embodiments, the resulting pool of nucleic acids encoding paired heavy and light chains for the antibodies are cloned into an expression vector or can be processed for sequencing. In some embodiments, the expression vector is engineered for phage display, yeast display, or other display technology. In some embodiments, the expression vector is for secretion expression and recombinant production of the antibodies. In some embodiments, the expression vector is for making a library of chimeric antigen receptors, where each CAR has one of the paired antibody clones obtained from the amplification reaction. In some embodiments, primers corresponding to heavy chains or light chains may be targeted to single isotypes of antibodies (e.g., IgG), or pools of primers corresponding to all available isotypes or some fraction thereof may be used.

In some embodiments, primers for the light chain and heavy chain are linked together so that each primer is capable of priming a reaction. In some embodiments, a 5′ azide-alkyne reaction (“Click”) coupling can bring together the heavy and light chain primers. In this embodiment, the dual primer is incubated with single cells in a well or emulsion, and nucleic acids are obtained where a nucleic acid encoding a heavy chain is linked to a nucleic acid encoding the paired light chain. In some embodiments, a microsurface (e.g., bead or microwell) is prepared and contains primer sequences that are capable of binding either heavy or light chain nucleic acids. Following mRNA capture, cDNA synthesis or PCR from a single cell in a spatial confinement with the primers in the well or on the bead, nucleic acids encoding the paired heavy and light chains become co-located with the heavy and light chain primers of the solid phase.

In some embodiments, nucleic acid probes for nucleic acids encoding heavy and light chain polypeptides are placed on a solid surface. In this embodiment, the probes for nucleic acids encoding heavy and light chain antibody polypeptides are interrogated with nucleic acids, e.g., mRNA, from a single cell. The probes on the solid phase will capture paired light and heavy chains encoding nucleic acids from the cell. In some embodiments, captured mRNA is reverse transcribed to make paired cDNAs encoding the light chain and heavy chain polypeptides from a single cell.

In some embodiments, the nucleic acids encoding the subunits of the immune binding protein are bar coded to enable identification of unique molecules. In some embodiments, a solid phase with a cell-specific barcode is made with spatially confined PCR reactions of a plurality of single template molecules containing a linker/adapter primer sequence, a random barcode sequence, and a secondary primer sequence. In some embodiments, a limited dilution of template molecules is used, and the template molecule is linked to a solid phase at very low loading rates to ensure only a single molecule is available as a template at each site. In this embodiment, at least one of the primers in this PCR reaction should be attached to the solid phase. In some embodiments, additional molecules may be added to load additional sites, knowing that previously bound sites are incapable of reacting because they were exhausted during previous rounds of PCR.
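The effect of a very low loading rate can be estimated with a short Python sketch that assumes template molecules land on sites independently, so that the number per site follows a Poisson distribution with mean λ. The Poisson model, the function names, and the example value λ = 0.1 are assumptions made for illustration; the text does not specify a loading rate.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a site receives exactly k template molecules."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def loading_summary(lam: float) -> dict:
    """Fractions of empty, singly loaded, and multiply loaded sites."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    multiple = 1.0 - empty - single
    occupied = 1.0 - empty
    return {
        "empty": empty,
        "single": single,
        "multiple": multiple,
        # Among sites that received any template, the share with exactly one.
        "single_given_occupied": single / occupied if occupied > 0 else 0.0,
    }


summary = loading_summary(0.1)  # assumed mean of 0.1 templates per site
```

Under this assumption most sites stay empty at a low λ, but nearly all occupied sites hold a single template, which is the trade-off that repeated loading rounds are meant to offset.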

In some embodiments, oligonucleotides can be attached at an extremely low loading rate to a surface and beads are flowed over the surface to ensure that each bead binds a single oligonucleotide. In some embodiments, beads are reflowed over the surface without being subjected to the constraints of Poissonian loading. In some embodiments, with a moderate surface of 100 cm², hundreds of millions of beads can be bound to individual molecules. In some embodiments, each bound bead would be guaranteed to have one and only one template sequence. In some embodiments, each spatially confined site (either a position or well on a patterned surface, or bead in emulsion) will contain the same barcoded DNA in close proximity, whereas other sites will each contain separate barcoded DNA in close proximity originating from other single molecule templates. In some embodiments, single stranded DNA can be generated through the use of a 5′ nuclease or denaturation of the uncoupled second strand. In this embodiment, the secondary primer sequence is available to perform a subsequent barcode extension reaction, or can be used directly to capture nucleic acids from single cells. In some embodiments, the bead can be ligated to a sequence containing a linker section and a fully random sequence to serve as a unique molecular identifier, and a tertiary primer sequence. In this embodiment, the tertiary primer sequence is available to perform a subsequent barcode extension reaction, or can be used directly to capture nucleic acids from single cells.
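The scale of “hundreds of millions” of beads on a 100 cm² surface can be checked with simple arithmetic. The Python sketch below assumes one bead per square cell of a regular grid; the 5 μm centre-to-centre pitch is a hypothetical value chosen only to show the calculation and is not taken from the text.

```python
import math


def sites_on_surface(area_cm2: float, pitch_um: float) -> int:
    """Number of bead sites on a square grid with the given pitch.

    1 cm = 10,000 um, so 1 cm^2 = 1e8 um^2.
    """
    area_um2 = area_cm2 * 1e8
    return math.floor(area_um2 / pitch_um ** 2)


# 100 cm^2 surface, assumed 5 um centre-to-centre spacing between beads.
bead_sites = sites_on_surface(100, 5.0)
```

With these assumed values the surface holds 400 million sites, which is consistent with the order of magnitude stated above.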

In some embodiments, a surface (e.g., glass surface) is selectively silanized and functional alkane or PEG (e.g., FSL, amino, azide, DBCO, fluorous group) is attached in an array of spots that are smaller than the size of the bead or diameter of the cells to be captured. In some embodiments, the remaining surface is silanized with passivating silane (e.g., alkane or PEG). Functional sites may be additionally modified with proteins or moieties to capture desired cells or specific types of cells. For example, CD19 can be attached to the surface for the capture of B cells from a cell mixture. Target cells are incubated with the surface at concentrations where a small number of cells are captured at each site. The cells are then loaded into the array in a non-Poissonian manner. In some embodiments, a self-assembling hydrogel is generated on top of each cell, for example, using PEG x4 dendrimer DBCO and PEG 10 kDa azide and a heterobifunctional linkage such as DBCO NHS for initial attachment to the cells or array position. Additional molecules may be incorporated in the hydrogel for capture of desired targets. In some embodiments, Protein G is attached for antibody capture, or poly dT oligonucleotides are attached for mRNA capture. Cells in this matrix may then be incubated with molecules for capture of matrix bound agents and therefore labelled, such as primers, DNA molecules, protein antigens, or antibodies. In some embodiments, a lysis solution is added to the cells on the surface, the cells are lysed, and their contents captured within the hydrogel matrix. In some embodiments, various reagents are flowed over the surface, such as wash buffers to remove reagents from a prior step, whilst maintaining bound RNA. In some embodiments, new reagents for a next step are added in this manner, such as, for example, a reverse transcriptase solution containing enzyme and suitable buffer for the synthesis of a cDNA library for each cell.
In some embodiments, it may be preferable to replace the non-hydrogel aqueous phase with a hydrocarbon or fluorous oil phase to prevent diffusion of intracellular or extracellular bound materials out of the matrix.

In some embodiments, the surface is patterned with hydrophilic spots on a hydrophobic or fluorous background. In this embodiment, droplets will self-assemble on the surface and be ready for subsequent reactions. These droplets may be used to generate hydrogels as well using click chemistry as described above. In some embodiments, the spots are on the order of the size of a cell and single cells can be captured in a non-Poissonian manner. In some embodiments, the spots are much larger than a single cell and capture of single cells occurs in a Poissonian fashion. In some embodiments, patterning is random rather than arrayed though this may result in lower loading densities.

In some embodiments, each spot contains a plurality of poly-dT primers with the same 5′ random DNA barcode so that each cell's mRNA can be specifically labelled. In some embodiments, a patterned surface is used to first capture a single bead that is smaller than the cell, but larger than the capture site. For example, a capture site of 1 μm can be combined with a bead size of 2 μm. In some embodiments, the beads are functionalized so that they can attach to both a cell and the capture site. For example, the beads can be coated with NHS and DBCO, while the capture sites have an azide. After attachment of beads to the capture site, cells are flowed so that each bead captures a single cell.
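How specifically a random barcode labels each cell depends on how many distinct barcodes are possible relative to the number of cells. The Python sketch below estimates this under assumptions that are not in the text: barcodes are drawn uniformly at random from all sequences of a given length, and the 12-nucleotide length and 10,000-cell count are hypothetical example values.

```python
def barcode_space(length_nt: int) -> int:
    """Number of distinct random DNA barcodes of the given length (4 bases)."""
    return 4 ** length_nt


def shared_barcode_fraction(n_cells: int, length_nt: int) -> float:
    """Expected fraction of cells whose barcode is also carried by at least
    one other cell, assuming uniformly random, independent barcodes."""
    n_barcodes = barcode_space(length_nt)
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


space_12 = barcode_space(12)                      # 16,777,216 barcodes
clash_12 = shared_barcode_fraction(10_000, 12)    # roughly 0.06% of cells
```

Under these assumptions, lengthening the barcode shrinks the clash fraction by a factor of four per added nucleotide.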

Once the cells are arrayed, it may be advantageous to transfer them to a microwell array containing other reagents for additional workup, such as lysis and capture of mRNA to primer coated beads. This enables non-Poissonian loading of cells and/or beads to a microwell array.

These techniques can be used to capture single cells for RNA capture on barcoded beads, or to exactly position a single bead at each capture site for additional workup. For example, barcoded cDNA on a bead may be put on the capture array so that a single bead is at each spot. In some embodiments, a PCR reaction may be performed that amplifies the barcoded section of each molecule and amplifies a particular region of a subset of molecules of interest (for instance heavy and light chains), then links the barcode to the particular region of interest via ligation or assembly PCR. In this manner a sequencing read will contain the region of interest and the barcode and not be subject to the barcode being on the 5′ or 3′ ends of a molecule longer than the sequencing read length.

Sequencing of Immune Binding Proteins

In some embodiments, the amplified nucleic acids are used in a sequencing reaction and the OE region can be flanked by one or more barcode regions (BC1 and BC2) (FIG. 1b). In some embodiments, the nucleic acids encoding the multiple chains of the immune binding protein are sequenced to identify the chains which form the immune binding protein (e.g., the heavy and light chains of an antibody).
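When each read carries both a barcode and a chain sequence, identifying which chains belong together can be reduced to grouping reads by barcode. The following Python sketch shows one possible way to do this; the read representation as (barcode, chain, sequence) tuples, the function name, and the rule of discarding barcodes with more than one distinct heavy or light sequence are all assumptions of this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Read = Tuple[str, str, str]  # (barcode, chain "H" or "L", sequence)


def pair_chains_by_barcode(reads: Iterable[Read]) -> Dict[str, Tuple[str, str]]:
    """Return {barcode: (heavy_sequence, light_sequence)} for barcodes that
    carry exactly one distinct heavy and one distinct light sequence."""
    by_barcode = defaultdict(lambda: {"H": set(), "L": set()})
    for barcode, chain, sequence in reads:
        if chain not in ("H", "L"):
            raise ValueError(f"unexpected chain label: {chain!r}")
        by_barcode[barcode][chain].add(sequence)

    pairs = {}
    for barcode, chains in by_barcode.items():
        if len(chains["H"]) == 1 and len(chains["L"]) == 1:
            pairs[barcode] = (next(iter(chains["H"])), next(iter(chains["L"])))
    return pairs


# Toy reads: BC01 is a clean pair, BC02 lacks a light chain,
# BC03 has two different heavy sequences and is treated as ambiguous.
toy_reads = [
    ("BC01", "H", "CARDYW"), ("BC01", "L", "CQQSYS"), ("BC01", "H", "CARDYW"),
    ("BC02", "H", "CAKGGF"),
    ("BC03", "H", "CARAAA"), ("BC03", "H", "CARTTT"), ("BC03", "L", "CQQYNS"),
]
paired = pair_chains_by_barcode(toy_reads)
```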

Sequencing tools, methods, apparatuses, and reagents are well known to the person of ordinary skill in the art and include, for example, single-molecule real-time sequencing (Pacific Biosciences), ion semiconductor (Ion Torrent sequencing of Thermo Fisher), pyrosequencing (454 Life Sciences of Roche Diagnostics), sequencing by synthesis (Illumina), sequencing by ligation (SOLiD sequencing, Thermo Fisher), DNA nanoball sequencing (Complete Genomics), heliscope sequencing (Helicos Biosciences), and chain termination (Sanger sequencing). Sequencing machines and reagents are commercially available for all of these techniques, including for example, from Pacific Biosciences, Thermo Fisher, Roche Diagnostics, Illumina, Complete Genomics, and Helicos Biosciences.

In some embodiments, the resulting sequences are characterized for putative lineage information based on sequence alignment. In some embodiments, the sequence information is analyzed for similarity scores between sequences using bioinformatics tools (e.g., BLAST), and then optionally grouped into a phylogeny tree based on this information.
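A very simple form of grouping by similarity score is sketched below in Python. It is not BLAST and does not build a phylogeny tree: it swaps in a position-by-position identity score on equal-length sequences and single-linkage grouping at a fixed threshold, both of which are assumptions of this example, as are the function names, the 0.8 threshold, and the toy sequences.

```python
from typing import List


def identity(a: str, b: str) -> float:
    """Fraction of identical positions; sequences of unequal length score 0
    in this sketch (no alignment is performed)."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_lineages(seqs: List[str], threshold: float = 0.8) -> List[List[str]]:
    """Single-linkage grouping: two sequences share a putative lineage if a
    chain of pairwise identities >= threshold connects them."""
    parent = list(range(len(seqs)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if identity(seqs[i], seqs[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i, seq in enumerate(seqs):
        groups.setdefault(find(i), []).append(seq)
    # Largest groups first; ties broken by content for a stable order.
    return sorted(groups.values(), key=lambda g: (-len(g), g))


toy_sequences = ["CARDYWGQG", "CARDFWGQG", "CARDFWGQA", "CTTSGYFDY"]
lineages = cluster_lineages(toy_sequences)
```

Here the first three toy sequences are joined into one group because each differs from its neighbour at a single position, while the fourth stays on its own.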

In some embodiments, sequences are compared using techniques well known to the person of ordinary skill in the art, including, for example, the local homology algorithm of Smith and Waterman, Adv Appl Math. 2:482, 1981; the homology alignment algorithm of Needleman and Wunsch, J Mol Biol. 48:443, 1970; the search for similarity method of Pearson and Lipman, Proc Natl Acad Sci. USA 85:2444, 1988; computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the GCG Wisconsin Software Package), or visual inspection (see generally, Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1995 Supplement)). Examples of algorithms that are suitable for comparing percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., J Mol. Biol. 215:403-410, 1990; and Altschul et al., Nucleic Acids Res. 25(17):3389-3402, 1997; respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information website. BLAST for nucleotide sequences can use the BLASTN program with default parameters, e.g., a wordlength (W) of 11, an expectation (E) of 10, M=5, N=−4, and a comparison of both strands. BLAST for amino acid sequences can use the BLASTP program with default parameters, e.g., a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, Proc Natl Acad Sci. USA 89:10915, 1992). Exemplary determination of sequence alignment and % sequence identity can also employ the BESTFIT or GAP programs in the GCG Wisconsin Software package (Accelrys, Madison Wis.), using default parameters provided.

Repertoires of Immune Binding Proteins

The invention relates to nucleic acids encoding immune binding proteins that preserve the in vivo multimeric associations of the immune polypeptide chains making up the immune binding protein (e.g., antibodies, T-lymphocyte receptors or innate immunity receptors). In some embodiments, immune binding protein libraries of the invention are enriched for nucleic acids encoding multimers that are functional polypeptides representing the multimeric complexes found in the repertoire from which the immune binding protein library was obtained.

In some embodiments, the nucleic acids represent the antibody repertoire of a subject who has become immune to an infectious disease, cancer, or other immunogenic challenge. In some embodiments, the nucleic acids represent the antibody repertoire of a subject who has had an immune reaction to an infectious disease, cancer, or other immunogenic challenge. In some embodiments, the antibody repertoire is from a subject that is naïve for the target antigen. In some embodiments, the antibody repertoire represents the germ line repertoire of a subject or species. In some embodiments, the nucleic acids encoding the heavy and light chains of the antibody are combined in appropriate combinatorial fashion to generate a repertoire of antigen binding domains from the heavy and light chains.

In some embodiments, the repertoire represents the T-cell receptor repertoire of a subject who has become immune to an infectious disease, cancer, or other immunogenic challenge. In some embodiments, the nucleic acids represent the T-cell receptor repertoire of a subject who has had an immune reaction to an infectious disease, cancer, or other immunogenic challenge. In some embodiments, the T-cell receptor repertoire is from a subject that is naïve for the target antigen. In some embodiments, the T-cell receptor repertoire represents the germ line repertoire of a subject or species. In some embodiments, the nucleic acids encoding the alpha, beta, gamma and zeta chains of the T-cell receptor are combined in appropriate combinatorial fashion to generate a repertoire of antigen binding domains from the T-cell receptor chains.

In some embodiments, the nucleic acids represent the innate immunity receptor repertoire of a subject who has become immune to an infectious disease, cancer, or other immunogenic challenge. In some embodiments, the nucleic acids represent the innate immunity receptor repertoire of a subject who has had an immune reaction to an infectious disease, cancer, or other immunogenic challenge. In some embodiments, the innate immunity receptor repertoire is from a subject that is naïve for the target antigen. In some embodiments, the innate immunity receptor repertoire represents the germ line repertoire of a subject or species.

In some embodiments, the nucleic acids encoding the polypeptide chains for immune binding proteins are derived from individuals who have mounted an immune response relevant to, for example, an infectious disease, a cancer, an autoimmune disease, an allergy, or a neurodegenerative disease. In some embodiments, the infectious disease is caused by an influenza virus. In some embodiments, the infectious disease is caused by a virus such as, for example, HIV, Ebola, Zika, HSV, RSV, or CMV.

Homologs of immune binding polypeptides of the invention are intended to be within the scope of the present invention. As used herein, the term “homologs” includes analogs and paralogs. The term “analogs” refers to two polynucleotides or polypeptides that have the same or similar function, but that have evolved separately in unrelated host organisms. The term “paralogs” refers to two polynucleotides or polypeptides that are related by duplication within a genome. Paralogs usually have different functions, but these functions may be related. Analogs and paralogs of an immune binding protein can differ from the immune binding protein by post-translational modifications, by amino acid sequence differences, or by both. In particular, homologs of the invention will generally exhibit at least 80-85%, 85-90%, 90-95%, or 95%, 96%, 97%, 98%, or 99% sequence identity with all or part of the immune binding protein or its polynucleotide sequences, and will exhibit a similar function. Variants include allelic variants. The term “allelic variant” refers to a polynucleotide or a polypeptide containing polymorphisms that lead to changes in the amino acid sequences of a protein and that exist within a natural population (e.g., a virus species or variety). Such natural allelic variations can typically result in 1-5% variance in a polynucleotide or a polypeptide. Allelic variants can be identified by sequencing the nucleic acid sequence of interest in a number of different species, which can be readily carried out by using hybridization probes to identify the same genetic locus in those species. Any and all such nucleic acid variations and resulting amino acid polymorphisms or variations that are the result of natural allelic variation and that do not alter the functional activity of the immune binding protein are intended to be within the scope of the invention.

As used herein, the term “derivative” or “variant” refers to an immune binding protein, or a nucleic acid encoding an immune binding protein, that has one or more conservative amino acid variations or other minor modifications such that the corresponding polypeptide has substantially equivalent function when compared to the wild type polypeptide. These variants or derivatives include polypeptides having minor modifications of the immune binding protein primary amino acid sequences that may result in peptides which have substantially equivalent activity as compared to the unmodified counterpart polypeptide. Such modifications may be deliberate, as by site-directed mutagenesis, or may be spontaneous. The term “variant” further contemplates deletions, additions and substitutions to the sequence, so long as the polypeptide functions as an immune binding protein. The term “variant” also includes modification of a polypeptide where the native signal peptide is replaced with a heterologous signal peptide to facilitate the expression or secretion of the polypeptide from a host species.

The immune binding proteins of the invention also may include amino acid sequences for introducing a glycosylation site or other site for modification or derivatization of the polypeptide. In an embodiment, the polypeptides of the invention described above may include the amino acid sequence N-X-S or N-X-T that can act as a glycosylation site. During glycosylation, an oligosaccharide chain is attached to asparagine (N) occurring in the tripeptide sequence N-X-S or N-X-T, where X can be any amino acid except Pro. This sequence is called a glycosylation sequon. This glycosylation site may be placed at the N-terminus, C-terminus, or within the internal sequence of the protein sequence used for the polypeptide of the invention.
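By way of non-limiting illustration only, candidate glycosylation sequons of the form N-X-S or N-X-T (X being any amino acid except Pro) can be located in a one-letter amino acid sequence as sketched below in Python. The function name and the example sequence are hypothetical and chosen for this sketch; the presence of a sequon indicates only a potential site and does not establish that glycosylation occurs in a given host cell.

```python
import re

# N, then any residue except proline, then S or T.
# The lookahead keeps matches zero-width past the N so that
# overlapping sequons (e.g., "NNST") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return 0-based positions of the Asn in each N-X-S/T sequon."""
    return [m.start() for m in SEQUON.finditer(protein.upper())]


# Hypothetical example sequence: N-G-S and N-K-T qualify, N-P-T does not.
print(find_sequons("MNGSANPTNKT"))
```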

Display Libraries of the Immune Binding Proteins

In some embodiments, the nucleic acids encoding immune binding proteins of the invention are engineered into vectors for displaying the immune binding protein on the surface of a cell or a viral particle. In some embodiments, repertoires of immune binding proteins (e.g., antibodies, T-cell receptors, or innate immunity receptors) are displayed on filamentous bacteriophage (e.g., McCafferty et al., 1990, Nature 348:552-554, which is incorporated by reference in its entirety for all purposes), yeast cells (e.g., Boder and Wittrup, 1997, Nat Biotechnol 15:553-557, which is incorporated by reference in its entirety for all purposes), and ribosomes (e.g., Hanes and Pluckthun, 1997, Proc Natl Acad Sci USA 94:4937-4942, which is incorporated by reference in its entirety for all purposes). Other embodiments of phage display are disclosed in, for example, U.S. Pat. Nos. 5,750,373, 5,733,743, 5,837,242, 5,969,108, 6,172,197, 5,580,717, and 5,658,727, all of which are incorporated by reference in their entirety for all purposes.

In some embodiments, phage display libraries are used to make human antibodies, T-cell receptors (or parts thereof), or innate immunity receptors (or parts thereof) from immunized humans, non-immunized humans, germ line sequences, or naive repertoires (Barbas & Burton, Trends Biotech (1996), 14:230; Griffiths et al., EMBO J. (1994), 13:3245; Vaughan et al., Nat. Biotech. (1996), 14:309; Winter EP 0368684 B1, all of which are incorporated by reference in their entirety for all purposes). In some embodiments, naive, or nonimmune, antigen binding libraries are generated using a variety of lymphoid tissues. Some of these libraries are commercially available, such as those developed by Cambridge Antibody Technology and Morphosys (Vaughan et al. (1996) Nature Biotech 14:309; Knappik et al. (2000) J. Mol. Biol. 296:57, all of which are incorporated by reference in their entirety for all purposes).

In some embodiments, Fab molecules can be displayed on phage if one of the chains (heavy or light) is fused to the g3 capsid protein and the complementary chain is exported to the periplasm as a soluble molecule. The two chains can be encoded on the same or on different replicons; the two antibody chains in each Fab molecule assemble post-translationally and the dimer is incorporated into the phage particle via linkage to one of the chains of g3p (see, e.g., U.S. Pat. No. 5,733,743, which is incorporated by reference in its entirety for all purposes). Alternatively, a scFv can be fused to a g3 capsid protein for display on the phage particle.

In some embodiments, nucleic acids encoding repertoires of immune binding proteins are engineered into vectors for display on bacterial, yeast, or mammalian cells. In some embodiments, bacterial, yeast, or mammalian cells displaying immune binding proteins of the invention are contacted with a fluorescently labeled antigen; cells that bind the fluorescently labeled antigen will be fluorescent and can then be isolated using fluorescence-activated cell sorting. In some embodiments, panning approaches are used to associate immune binding proteins with antigens bound by the immune binding protein.

In some embodiments, a library of immune binding proteins is engineered into a phage display vector and transformed into cells to generate phage which display the immune binding protein of interest in a fusion with one of the phage coat proteins. The phage library can be contacted with (i.e., panned against) a surface (e.g., a microtiter plate) that is coated with test antigens of interest. The plate is then washed one or more times with buffer. Phage that contain antibody variants that bind to the antigen of interest will be retained, whereas those that do not bind to the antigen will be washed away. The resulting phage library can subsequently be transformed into other host cells for further screening or replication and/or characterized by sequencing.

In some embodiments, the heavy chain/light chain pair of an antibody can be inserted into a surface display vector and cells can be transformed with this vector to display the antibody on the surface. Separately, a set of one or more antigens can be linked to a set of identifying nucleic acid barcode sequences such that each different antigen is linked to a unique sequence. The linkage can be done chemically or alternatively by cloning a set of barcoded antigens into a suitable display vector and expressing the antigen on the surface of phage or cells. The antigen set, now linked to a nucleic acid identifier, can then be contacted with the cells which display antibody on the surface. After the incubation, the individual cells can be isolated via emulsion, single-cell sorting, or other means. The resulting isolate will consist of a single cell displaying a homogeneous antibody on its surface, bound to one or more of the barcoded antigens. The nucleic acids coding for the antibody heavy chain, light chain, and antigen barcode can then be amplified together and sequenced. The resulting sequence information will yield antibody/antigen coupling information. For example, if one antibody binds exclusively to a single antigen, the resulting sequence information will yield a unique antibody/antigen sequence. If an antibody binds a plurality of antigens, it will yield a mixed population of antibody/antigen coupled sequences. Thus, the relative specificity of each antibody in the population with respect to a set of antigens can be determined. Moreover, the relative abundance of the different coupled species can be correlated to the relative affinity of an antibody to each of the antigens in a panel.
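By way of non-limiting illustration only, the tallying of antibody/antigen coupled sequences described above can be sketched in Python as follows. The input format (one tuple of antibody identifier and antigen barcode per sequenced coupled molecule), the identifiers, and the function name are hypothetical assumptions made for this sketch. The resulting fractions are relative read abundances only; whether and how they correlate with relative affinity would need to be established for a given assay.

```python
from collections import Counter, defaultdict


def coupling_profile(reads):
    """Summarize coupled reads as per-antibody antigen fractions.

    reads: iterable of (antibody_id, antigen_barcode) tuples, one per
    sequenced antibody/antigen coupled molecule (assumed input format).
    Returns {antibody_id: {antigen_barcode: fraction_of_reads}}.
    """
    counts = defaultdict(Counter)
    for antibody, barcode in reads:
        counts[antibody][barcode] += 1

    profile = {}
    for antibody, per_antigen in counts.items():
        total = sum(per_antigen.values())
        profile[antibody] = {bc: n / total for bc, n in per_antigen.items()}
    return profile


# Hypothetical reads: Ab1 couples to a single antigen barcode,
# whereas Ab2 yields a mixed population of coupled sequences.
reads = [("Ab1", "AgA")] * 4 + [("Ab2", "AgA")] * 3 + [("Ab2", "AgB")]
for antibody, fractions in coupling_profile(reads).items():
    print(antibody, fractions)
```

An antibody whose reads fall on a single barcode corresponds to the exclusive binder in the passage above, and an antibody whose reads are spread over several barcodes corresponds to the mixed population.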

In some embodiments, the pair can be cloned into a chimeric antigen receptor. A chimeric antigen receptor construct consists of at least a binding region (typically an scFv) and an intracellular signaling region. It may additionally contain other components such as a transmembrane region, a spacer/linker region, multiple signaling regions, and/or protein targeting and translocation sequences. Chimeric antigen receptors are well known in the art as described in, for example, U.S. patent application US20140242701, and U.S. Pat. Nos. 5,359,046, 5,686,281 and 6,103,521, which are incorporated by reference in their entirety for all purposes. The construct is placed into cells and the receptor is expressed, typically though not necessarily on the surface of a mammalian T cell. Upon the scFv binding to an antigen, the signaling domain initiates a cascade of events that ultimately results in transcription and activation of genes. In one example, the cell is further modified with a construct that expresses a marker protein, such as a fluorescent protein, luminescent protein, enzyme, or selectable marker that allows differentiation between that cell and other non-activated cells in the population. Thus, a population of cells containing a library of antibody constructs can be screened for those cells which are activated by binding to a target.

Immune Binding Protein and Antigens

Immune binding proteins bind a very diverse spectrum of antigens, with varying levels of affinity and specificity. In some embodiments, immune binding proteins bind very specific antigens, while other immune binding proteins bind a broader array of antigens. Depending on the application, either one of these options may be desired. For example, an immune binding protein that can recognize multiple strains of influenza would have benefit against many strains of influenza, whereas an immune binding protein for an anti-tumor therapy may need to bind only one very specific conformation of an antigen, to avoid attacking normal versions of the antigen present on healthy cells and tissues.

In some embodiments, a repertoire of immune binding proteins (e.g., antibodies, T-cell receptors, and/or innate immunity receptors) made by the methods of the invention is screened against a panel of antigens. In some embodiments, each member of the panel of antigens is labeled with nucleic acids encoding unique bar codes for each antigen. In some embodiments, the screening of multiple antigens is followed by amplification reactions that produce nucleic acids encoding the polypeptide chains of the immune binding protein (e.g., the heavy and light chains of an antibody) and the antigen (e.g., if the antigen is a polypeptide) or a nucleic acid bar code for the antigen. In some embodiments, immune binding proteins are displayed on a cell surface and screened against a panel of bar-coded antigens. Those cells with displayed immune binding proteins that bind an antigen are placed in microwells (a single cell in each microwell) and/or captured in an emulsion, and amplification reactions are performed to make nucleic acids encoding the chains of the immune binding protein and the bar code of the antigen.

In some embodiments, an amplification reaction as described above for an immune protein is used, with the addition of a set of forward and reverse primers for amplification of the nucleic acid attached to the antigen (AF and AR) (FIG. 1C). In some embodiments, the AR primer additionally contains a barcode (BC5) and an OE region matching that of a primer for a nucleic acid encoding one of the chains of the immune protein (e.g., the LF primer for an antibody). The amplification is carried out, resulting in a mixture of nucleic acids encoding the immune protein (e.g., HC/LC molecules) and nucleic acids encoding a chain of the immune protein and the nucleic acid for identifying the antigen (e.g., HC/Antigen molecules). In some embodiments, these molecules are sequenced using high-throughput methods, and the resulting information identifies antigens with individual immune binding proteins (e.g., antibodies).

In some embodiments, a second overlap extension (OE) is placed on the BR and immune protein primers (e.g., for an antibody the LF primer). In this embodiment, following amplification one obtains a nucleic acid encoding the chains for the immune binding protein (e.g., heavy and light chains of an antibody), and the bar code for the antigen. In some embodiments, this multipartite nucleic acid is sequenced to identify the immune binding protein, and the antigens to which the immune binding protein bound.

Nucleic Acids

In some embodiments, the present invention relates to the nucleic acids that encode, at least in part, the individual peptides, polypeptides, proteins, and RNA control devices of the present invention. In some embodiments, the nucleic acids may be natural, synthetic or a combination thereof. The nucleic acids of the invention may be RNA, mRNA, DNA or cDNA.

In some embodiments, the nucleic acids of the invention also include expression vectors, such as plasmids, or viral vectors, or linear vectors, or vectors that integrate into chromosomal DNA. Expression vectors can contain a nucleic acid sequence that enables the vector to replicate in one or more selected host cells. Such sequences are well known for a variety of cells. The origin of replication from the plasmid pBR322 is suitable for most Gram-negative bacteria. In eukaryotic host cells, e.g., mammalian cells, the expression vector can be integrated into the host cell chromosome and then replicate with the host chromosome. Similarly, vectors can be integrated into the chromosome of prokaryotic cells.

Expression vectors also generally contain a selection gene, also termed a selectable marker. Selectable markers are well-known in the art for prokaryotic and eukaryotic cells, including host cells of the invention. Generally, the selection gene encodes a protein necessary for the survival or growth of transformed host cells grown in a selective culture medium. Host cells not transformed with the vector containing the selection gene will not survive in the culture medium. Typical selection genes encode proteins that (a) confer resistance to antibiotics or other toxins, e.g., ampicillin, neomycin, methotrexate, or tetracycline, (b) complement auxotrophic deficiencies, or (c) supply critical nutrients not available from complex media, e.g., the gene encoding D-alanine racemase for Bacilli. In some embodiments, an exemplary selection scheme utilizes a drug to arrest growth of a host cell. Those cells that are successfully transformed with a heterologous gene produce a protein conferring drug resistance and thus survive the selection regimen. Other selectable markers for use in bacterial or eukaryotic (including mammalian) systems are well-known in the art.

In some embodiments, an example of a promoter that is capable of expressing a transgene encoding an immune binding protein of the invention in a mammalian host cell is the EF1a promoter. The native EF1a promoter drives expression of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. The EF1a promoter has been extensively used in mammalian expression plasmids and has been shown to be effective in driving expression from transgenes cloned into a lentiviral vector. See, e.g., Milone et al., Mol. Ther. 17(8): 1453-1464 (2009), which is incorporated by reference in its entirety for all purposes. Another example of a promoter is the immediate early cytomegalovirus (CMV) promoter sequence. This promoter sequence is a strong constitutive promoter sequence capable of driving high levels of expression of any polynucleotide sequence operatively linked thereto. Other constitutive promoter sequences may also be used, including, but not limited to, the simian virus 40 (SV40) early promoter, mouse mammary tumor virus promoter (MMTV), human immunodeficiency virus (HIV) long terminal repeat (LTR) promoter, MoMuLV promoter, phosphoglycerate kinase (PGK) promoter, MND promoter (a synthetic promoter that contains the U3 region of a modified MoMuLV LTR with myeloproliferative sarcoma virus enhancer, see, e.g., Li et al., J. Neurosci. Methods vol. 189, pp. 56-64 (2010), which is incorporated by reference in its entirety for all purposes), an avian leukemia virus promoter, an Epstein-Barr virus immediate early promoter, a Rous sarcoma virus promoter, as well as human gene promoters such as, but not limited to, the actin promoter, the myosin promoter, the elongation factor-1a promoter, the hemoglobin promoter, and the creatine kinase promoter. Further, the invention is not limited to the use of constitutive promoters.

Inducible promoters are also contemplated as part of the invention. Examples of inducible promoters include, but are not limited to, a metallothionein promoter, a glucocorticoid promoter, a progesterone promoter, a tetracycline promoter, a c-fos promoter, the T-REx system of ThermoFisher which places expression from the human cytomegalovirus immediate-early promoter under the control of tetracycline operator(s), and RheoSwitch promoters of Intrexon. See, e.g., Karzenowski, D. et al., BioTechniques 39:191-196 (2005); Dai, X. et al., Protein Expr. Purif. 42:236-245 (2005); Palli, S. R. et al., Eur. J. Biochem. 270:1308-1315 (2003); Dhadialla, T. S. et al., Annual Rev. Entomol. 43:545-569 (1998); Kumar, M. B. et al., J. Biol. Chem. 279:27211-27218 (2004); Verhaegen, M. et al., Anal. Chem. 74:4378-4385 (2002); Katalam, A. K. et al., Molecular Therapy 13:S103 (2006); Karzenowski, D. et al., Molecular Therapy 13:S194 (2006); and U.S. Pat. Nos. 8,895,306, 8,822,754, 8,748,125, and 8,536,354, all of which are incorporated by reference in their entirety for all purposes.

Expression vectors of the invention typically have promoter elements, e.g., enhancers, to regulate the frequency of transcriptional initiation. Typically, these are located in the region 30-110 bp upstream of the start site, although a number of promoters have been shown to contain functional elements downstream of the start site as well. The spacing between promoter elements frequently is flexible, so that promoter function is preserved when elements are inverted or moved relative to one another. In the thymidine kinase (tk) promoter, the spacing between promoter elements can be increased to 50 bp apart before activity begins to decline. Depending on the promoter, it appears that individual elements can function either cooperatively or independently to activate transcription.

In some embodiments, control regions suitable for bacterial host cells are used in the expression vector. In some embodiments, suitable control regions for directing transcription of the nucleic acid constructs of the invention include the control regions obtained from the E. coli lac operon, Streptomyces coelicolor agarase gene (dagA), Bacillus subtilis levansucrase gene (sacB), Bacillus licheniformis alpha-amylase gene (amyL), Bacillus stearothermophilus maltogenic amylase gene (amyM), Bacillus amyloliquefaciens alpha-amylase gene (amyQ), Bacillus licheniformis penicillinase gene (penP), Bacillus subtilis xylA and xylB genes, and the prokaryotic beta-lactamase gene, the tac promoter, or the T7 promoter.

In some embodiments, control regions for filamentous fungal host cells include control regions obtained from the genes for Aspergillus oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase, Aspergillus niger neutral alpha-amylase, Aspergillus niger acid stable alpha-amylase, Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Rhizomucor miehei lipase, Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase, Aspergillus nidulans acetamidase, and Fusarium oxysporum trypsin-like protease (WO 96/00787), as well as the NA2-tpi promoter (a hybrid of the promoters from the genes for Aspergillus niger neutral alpha-amylase and Aspergillus oryzae triose phosphate isomerase), and mutant, truncated, and hybrid control regions thereof. Exemplary yeast cell control regions can be from the genes for Saccharomyces cerevisiae enolase (ENO-1), Saccharomyces cerevisiae galactokinase (GAL1), Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH2/GAP), and Saccharomyces cerevisiae 3-phosphoglycerate kinase.

In some embodiments, exemplary control regions for insect cells include, among others, those based on polyhedrin, PCNA, OpIE2, OpIE1, Drosophila metallothionein, and Drosophila actin 5C. In some embodiments, insect cell promoters can be used with baculoviral vectors.

In some embodiments, exemplary control regions for plant cells include, among others, those based on cauliflower mosaic virus (CaMV) 35S, polyubiquitin gene (PvUbi1 and PvUbi2), rice (Oryza sativa) actin 1 (OsAct1) and actin 2 (OsAct2) control regions, the maize ubiquitin 1 (ZmUbi1) control region, and multiple rice ubiquitin (RUBQ1, RUBQ2, rubi3) control regions.

In some embodiments, the expression vector contains one or more selectable markers, which permit selection of transformed cells. A selectable marker is a gene the product of which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like. Examples of bacterial selectable markers are the dal genes from Bacillus subtilis or Bacillus licheniformis, or markers that confer antibiotic resistance, such as ampicillin, kanamycin, chloramphenicol (Example 1) or tetracycline resistance. Suitable markers for yeast host cells are ADE2, HIS3, LEU2, LYS2, MET3, TRP1, and URA3. Selectable markers for use in a filamentous fungal host cell include, but are not limited to, amdS (acetamidase), argB (ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), hph (hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5′-phosphate decarboxylase), sC (sulfate adenyltransferase), and trpC (anthranilate synthase), as well as equivalents thereof. Embodiments for use in an Aspergillus cell include the amdS and pyrG genes of Aspergillus nidulans or Aspergillus oryzae and the bar gene of Streptomyces hygroscopicus.

In some embodiments, it may be desirable to modify the polypeptides of the present invention. One of skill will recognize many ways of generating alterations in a given nucleic acid construct to generate variant polypeptides. Such well-known methods include site-directed mutagenesis, PCR amplification using degenerate oligonucleotides, exposure of cells containing the nucleic acid to mutagenic agents or radiation, chemical synthesis of a desired oligonucleotide (e.g., in conjunction with ligation and/or cloning to generate large nucleic acids) and other well-known techniques (see, e.g., Gillam and Smith, Gene 8:81-97, 1979; Roberts et al., Nature 328:731-734, 1987, each of which is incorporated by reference in its entirety for all purposes). In some embodiments, the recombinant nucleic acids encoding the polypeptides of the invention are modified to provide preferred codons which enhance translation of the nucleic acid in a selected organism.

The polynucleotides of the invention also include polynucleotides including nucleotide sequences that are substantially equivalent to the polynucleotides of the invention. Polynucleotides according to the invention can have at least about 80%, more typically at least about 90%, and even more typically at least about 95%, sequence identity to a polynucleotide of the invention. The invention also provides the complement of the polynucleotides including a nucleotide sequence that has at least about 80%, more typically at least about 90%, and even more typically at least about 95%, sequence identity to a polynucleotide encoding a polypeptide recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known to those of skill in the art and can include, for example, methods for determining hybridization conditions which can routinely isolate polynucleotides of the desired sequence identities.

Nucleic acids which encode protein analogs or variants in accordance with this invention (i.e., wherein one or more amino acids are designed to differ from the wild type polypeptide) may be produced using site-directed mutagenesis or PCR amplification in which the primer(s) have the desired point mutations. For a detailed description of suitable mutagenesis techniques, see Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989) and/or Current Protocols in Molecular Biology, Ausubel et al., eds, Greene Publishers Inc. and Wiley and Sons, N.Y. (1994), each of which is incorporated by reference in its entirety for all purposes. Chemical synthesis using methods well known in the art, such as that described by Engels et al., Angew Chem Intl Ed. 28:716-34, 1989 (which is incorporated by reference in its entirety for all purposes), may also be used to prepare such nucleic acids.

In some embodiments, amino acid “substitutions” for creating variants are preferably the result of replacing one amino acid with another amino acid having similar structural and/or chemical properties, i.e., conservative amino acid replacements. Amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino acids include aspartic acid and glutamic acid.
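By way of non-limiting illustration only, the four exemplary amino acid classes listed above can be encoded as a lookup and used to flag whether a proposed substitution stays within a class, as sketched below in Python. Treating "same class" as the test for a conservative replacement is a simplifying assumption of this sketch, and the names used are hypothetical; other groupings or substitution matrices could equally be used.

```python
# One-letter codes for the four exemplary classes recited above.
CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala Leu Ile Val Pro Phe Trp Met
    "polar_neutral": set("GSTCYNQ"),  # Gly Ser Thr Cys Tyr Asn Gln
    "basic": set("RKH"),              # Arg Lys His
    "acidic": set("DE"),              # Asp Glu
}


def residue_class(aa):
    """Return the class name for a one-letter amino acid code."""
    aa = aa.upper()
    for name, members in CLASSES.items():
        if aa in members:
            return name
    raise ValueError(f"unrecognized amino acid code: {aa!r}")


def is_conservative(original, replacement):
    """True if both residues fall in the same class (sketch assumption)."""
    return residue_class(original) == residue_class(replacement)


print(is_conservative("L", "I"))  # both nonpolar
print(is_conservative("K", "D"))  # basic versus acidic
```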

The nucleic acid of the present invention can be linked to another nucleic acid so as to be expressed under control of a suitable promoter. The nucleic acid of the present invention can be also linked to, in order to attain efficient transcription of the nucleic acid, other regulatory elements that cooperate with a promoter or a transcription initiation site, for example, a nucleic acid comprising an enhancer sequence, a polyA site, or a terminator sequence. In addition to the nucleic acid of the present invention, a gene that can be a marker for confirming expression of the nucleic acid (e.g., a drug resistance gene, a gene encoding a reporter enzyme, or a gene encoding a fluorescent protein) may be incorporated.

When the nucleic acid of the present invention is introduced into a cell ex vivo, the nucleic acid of the present invention may be combined with a substance that promotes transference of a nucleic acid into a cell, for example, a reagent for introducing a nucleic acid such as a liposome or a cationic lipid, in addition to the aforementioned excipients. Alternatively, a vector carrying the nucleic acid of the present invention is also useful. Particularly, a composition in a form suitable for administration to a living body which contains the nucleic acid of the present invention carried by a suitable vector is suitable for in vivo gene therapy.

Host Cells

In some embodiments, nucleic acids encoding an immune binding protein of the invention (e.g., an antibody) are cloned into an appropriate expression vector for expression of the immune binding protein in a host cell. In some embodiments, the host cells of the invention include, for example, bacterial, fungal, or mammalian host cells. In some embodiments, the host cell is a bacterium, including, for example, Bacillus, such as B. licheniformis or B. subtilis; Pantoea, such as P. citrea; Pseudomonas, such as P. alcaligenes; Streptomyces, such as S. lividans or S. rubiginosus; Escherichia, such as E. coli; Enterobacter; Streptococcus; Archaea, such as Methanosarcina mazei; or Corynebacterium, such as C. glutamicum.

In some embodiments, the host cells are fungi cells, including, but notlimited to, fungi of the genera Saccharomyces, Klyuveromyces, Candida,Pichia, Debaromyces, Hansenula, Yarrowia, Zygosaccharomyces, orSchizosaccharomyces. In some embodiments, the host cell is a fungi,including, among others, Saccharomyces cerevisiae, Schizosaccharomycespombe, Kluyveromyces lactis, Kluyveromyces marxianus, Aspergillusterreus, Aspergillus niger, Pichia pastoris, Rhizopus arrhizus, Rhizobusoryzae, Yarrowia lipolytica, and the like. In some embodiments, theeukaryotic cells are algal, including but not limited to algae of thegenera Chlorella, Chlamydomonas, Scenedesmus, Isochrysis, Dunaliella,Tetraselmis, Nannochloropsis, or Prototheca. In some embodiments, thealgae is a green algae, red algae, glaucophytes, chlorarachniophytes,euglenids, chromista, or dinoflagellates.

In some embodiments, the eukaryotic cells are mammalian cells, such as mouse, rat, rabbit, hamster, porcine, bovine, feline, or canine cells. In some embodiments, the mammalian cells are cells of primates, including but not limited to, monkeys, chimpanzees, gorillas, and humans. In some embodiments, the mammalian cells are mouse cells, as mice routinely function as a model for other mammals, most particularly for humans (see, e.g., Hanna, J. et al., Science 318:1920-23, 2007; Holtzman, D. M. et al., J Clin Invest. 103(6):R15-R21, 1999; Warren, R. S. et al., J Clin Invest. 95:1789-1797, 1995; each publication is incorporated by reference in its entirety for all purposes). In some embodiments, animal cells include, for example, fibroblasts, epithelial cells (e.g., renal, mammary, prostate, lung), keratinocytes, hepatocytes, adipocytes, endothelial cells, and hematopoietic cells. In some embodiments, the animal cells are adult cells (e.g., terminally differentiated, dividing or non-dividing) or embryonic cells (e.g., blastocyst cells, etc.) or stem cells. In some embodiments, the animal cell is a cell line derived from an animal or other source.

In some embodiments, the mammalian cell is a cell found in the circulatory system of a mammal, including humans. Exemplary circulatory system cells include, among others, red blood cells, platelets, plasma cells, T-cells, natural killer cells, B-cells, macrophages, neutrophils, or the like, and precursor cells of the same. As a group, these cells are defined to be circulating eukaryotic cells of the invention. In some embodiments, the mammalian cells are derived from any of these circulating eukaryotic cells. The present invention may be used with any of these circulating cells or cells derived from the circulating cells. In some embodiments, the mammalian cell is a T-cell or T-cell precursor or progenitor cell. In some embodiments, the mammalian cell is a helper T-cell, a cytotoxic T-cell, a memory T-cell, a regulatory T-cell, a natural killer T-cell, a mucosal associated invariant T-cell, a gamma delta T cell, or a precursor or progenitor cell to the aforementioned. In some embodiments, the mammalian cell is a natural killer cell, or a precursor or progenitor cell to the natural killer cell. In some embodiments, the mammalian cell is a B-cell, or a plasma cell, or a B-cell precursor or progenitor cell. In some embodiments, the mammalian cell is a neutrophil or a neutrophil precursor or progenitor cell. In some embodiments, the mammalian cell is a megakaryocyte or a precursor or progenitor cell to the megakaryocyte. In some embodiments, the mammalian cell is a macrophage or a precursor or progenitor cell to a macrophage.

In some embodiments, a source of cells is obtained from a subject. The subject may be any living organism. Examples of subjects include humans, dogs, cats, mice, rats, and transgenic species thereof. In some embodiments, T cells can be obtained from a number of sources, including peripheral blood mononuclear cells, bone marrow, lymph node tissue, cord blood, thymus tissue, tissue from a site of infection, ascites, pleural effusion, spleen tissue, and tumors. In some embodiments, any number of T cell lines available in the art may be used. In some embodiments, T cells can be obtained from a unit of blood collected from a subject using any number of techniques known to the skilled artisan, such as Ficoll separation. In some embodiments, cells from the circulating blood of an individual are obtained by apheresis. The apheresis product typically contains lymphocytes, including T cells, monocytes, granulocytes, B cells, other nucleated white blood cells, red blood cells, and platelets. In some embodiments, the cells collected by apheresis may be washed to remove the plasma fraction and to place the cells in an appropriate buffer or media for subsequent processing steps. In some embodiments, the cells are washed with phosphate buffered saline (PBS). In an alternative aspect, the wash solution lacks calcium and may lack magnesium or may lack many if not all divalent cations. Initial activation steps in the absence of calcium can lead to magnified activation.

In some embodiments, the plant cells are cells of monocotyledonous or dicotyledonous plants, including, but not limited to, alfalfa, almonds, asparagus, avocado, banana, barley, bean, blackberry, brassicas, broccoli, cabbage, canola, carrot, cauliflower, celery, cherry, chicory, citrus, coffee, cotton, cucumber, eucalyptus, hemp, lettuce, lentil, maize, mango, melon, oat, papaya, pea, peanut, pineapple, plum, potato (including sweet potatoes), pumpkin, radish, rapeseed, raspberry, rice, rye, sorghum, soybean, spinach, strawberry, sugar beet, sugarcane, sunflower, tobacco, tomato, turnip, wheat, zucchini, and other fruiting vegetables (e.g., tomatoes, pepper, chili, eggplant, cucumber, squash etc.), other bulb vegetables (e.g., garlic, onion, leek etc.), other pome fruit (e.g., apples, pears etc.), other stone fruit (e.g., peach, nectarine, apricot, pears, plums etc.), Arabidopsis, woody plants such as coniferous and deciduous trees, an ornamental plant, a perennial grass, a forage crop, flowers, other vegetables, other fruits, other agricultural crops, herbs, grass, or perennial plant parts (e.g., bulbs; tubers; roots; crowns; stems; stolons; tillers; shoots; cuttings, including un-rooted cuttings, rooted cuttings, and callus cuttings or callus-generated plantlets; apical meristems etc.). The term “plants” refers to all physical parts of a plant, including seeds, seedlings, saplings, roots, tubers, stems, stalks, foliage and fruits.

Applications

In some embodiments, the immune binding proteins of the invention are used in therapies for infectious diseases, cancer, allergies, and autoimmune diseases. In some embodiments, the methods of the invention are used to make repertoires of immune binding proteins from subjects that have been challenged/infected with an infectious agent. In some embodiments, the immune binding proteins of the invention are used in therapies to treat subjects infected with an infectious agent. In some embodiments, the immune binding proteins of the invention are used to treat subjects with cancer or allergies. In some embodiments, the immune binding proteins of the invention are used to treat melanoma, lymphoma, leukemia and other cancers responsive to immune therapy. In some embodiments, the immune binding proteins of the invention are used to treat cancers that respond to immune checkpoint inhibitor therapy. In some embodiments, addition of exogenous immune binding protein (e.g., antibody) helps the subject's body accelerate its own immune response to a pathogen, in effect “transplanting” the immunity from one individual to another. In some embodiments, the immune binding proteins of the invention are used prophylactically. In some embodiments, the immune binding proteins of the invention are used in diagnostic applications. In some embodiments, the immune binding proteins of the invention provide information on a subject's response to a therapy. In some embodiments, the immune binding proteins of the invention provide information on a subject's response to an antibody therapy, small molecule drug therapy, biologic therapy, or cellular immunotherapy.

In some embodiments, immune binding proteins (e.g., antibodies) can be obtained from the subject that neutralize an infectious agent or can be made to become neutralizing. In some embodiments, the infectious agent is a bacterial strain of Staphylococci, Streptococcus, Escherichia coli, Pseudomonas, or Salmonella. In some embodiments, the infectious agent is Staphylococcus aureus, Neisseria gonorrhoeae, Streptococcus pyogenes, Group A Streptococcus, Group B Streptococcus (Streptococcus agalactiae), Streptococcus pneumoniae, or Clostridium tetani. In some embodiments, the infectious agent is a bacterial pathogen that may infect host cells including, for example, Helicobacter pylori, Legionella pneumophila, a bacterial strain of Mycobacteria sps. (e.g., M. tuberculosis, M. avium, M. intracellulare, M. kansasii, or M. gordonae), Neisseria meningitidis, Listeria monocytogenes, Rickettsia rickettsii, Salmonella spp., Brucella spp., Shigella spp., or certain E. coli strains or other bacteria that have acquired genes with invasive factors. In some embodiments, the infectious agent is a bacterial pathogen that is antibiotic resistant. In some embodiments, the infectious agent is a viral pathogen including, for example, Ebola, Zika, RSV, Retroviridae (e.g., human immunodeficiency viruses such as HIV-1 and HIV-LP), Picornaviridae (e.g., poliovirus, hepatitis A virus, enterovirus, human coxsackievirus, rhinovirus, and echovirus), rubella virus, coronavirus, vesicular stomatitis virus, rabies virus, ebola virus, parainfluenza virus, mumps virus, measles virus, respiratory syncytial virus, influenza virus, hepatitis B virus, parvovirus, Adenoviridae, Herpesviridae [e.g., type 1 and type 2 herpes simplex virus (HSV), varicella-zoster virus, cytomegalovirus (CMV), and herpes virus], Poxviridae (e.g., smallpox virus, vaccinia virus, and pox virus), or hepatitis C virus.

In some embodiments, immune binding proteins of the invention are used to boost the immunity of a subject against an infectious disease. For example, in influenza the body responds within 7-10 days to a challenge; however, in immunocompromised patients such as the elderly, the immune response timing or extent may be insufficient to fight off the infection, resulting in severe complications and possibly death. By boosting the immune system with antibodies designed to fight the relevant strain of influenza, the infection in the subject can be treated. In some embodiments, the methods of the invention are used to rapidly develop strain-specific antibodies to emerging pandemic strains of influenza. In some embodiments, immune binding proteins of the invention are used to treat infected patients and/or passively immunize vulnerable populations facing an outbreak. In some embodiments, the immune binding proteins are administered prophylactically. In some embodiments, the prophylactic administration of the immune binding proteins protects at-risk groups of subjects from a disease.

In some embodiments, the infectious agent is a herpes simplex virus 1 (HSV-1), herpes simplex virus 2 (HSV-2), varicella zoster, Epstein-Barr, cytomegalovirus (CMV), or Kaposi's sarcoma virus. HSV-1 primarily causes oral herpes, ocular herpes, and herpes encephalitis, and occasionally causes genital herpes; HSV-2 primarily causes genital herpes but can also cause oral herpes; varicella zoster causes chickenpox and shingles; Epstein-Barr causes mononucleosis and is associated with several cancers including Burkitt's lymphoma; CMV causes mononucleosis-like syndrome and congenital/neonatal morbidity and mortality. Some of the Herpesviridae, and in particular HSV-1, have been associated with and proposed as causative agents for Alzheimer's disease. In some embodiments, immune binding proteins of the invention can be used to treat and/or passively immunize against these Herpesviridae. In some embodiments, an injection or topical application of an antibody against HSV-1 or HSV-2 can be employed to reduce the incidence or severity of the effects of herpes outbreaks.

In some embodiments, the immune binding proteins of the invention are useful for treating a cancer. In some embodiments, the cancer is a sarcoma, carcinoma, melanoma, chordoma, malignant histiocytoma, mesothelioma, glioblastoma, neuroblastoma, medulloblastoma, malignant meningioma, malignant schwannoma, leukemia, lymphoma, myeloma, myelodysplastic syndrome, or myeloproliferative disease. In some embodiments, the cancer is a leukemia, lymphoma, myeloma, myelodysplastic syndrome, and/or myeloproliferative disease. In some embodiments, the cancer is one that is responsive to immunotherapy. In some embodiments, the cancer is one that is responsive to immune checkpoint inhibitor therapy.

In some embodiments, the immune binding proteins of the invention are specific for a tumor specific or enriched antigen. In some embodiments, tumor specific or enriched antigens include, for example, one or more of 4-1BB, 5T4, adenocarcinoma antigen, alpha-fetoprotein, BAFF, B-lymphoma cell, C242 antigen, CA-125, carbonic anhydrase 9 (CA-IX), C-MET, CCR4, CD152, CD19, CD20, CD21, CD22, CD23 (IgE receptor), CD28, CD30 (TNFRSF8), CD33, CD4, CD40, CD44 v6, CD51, CD52, CD56, CD74, CD80, CEA, CNTO888, CTLA-4, DR5, EGFR, EpCAM, EphA3, CD3, FAP, fibronectin extra domain-B, folate receptor 1, GD2, GD3 ganglioside, glycoprotein 75, GPNMB, HER2/neu, HGF, human scatter factor receptor kinase, IGF-1 receptor, IGF-I, IgG1, L1-CAM, IL-13, IL-6, insulin-like growth factor I receptor, alpha 5β1-integrin, integrin αvβ3, MORAb-009, MS4A1, MUC1, mucin CanAg, N-glycolylneuraminic acid, NPC-1C, PDGF-Rα, PDL192, phosphatidylserine, prostatic carcinoma cells, RANKL, RON, ROR1, SCH 900105, SDC1, SLAMF7, TAG-72, tenascin C, TGF β2, TGF-β, TRAIL-R1, TRAIL-R2, tumor antigen CTAA16.88, VEGF-A, VEGFR-1, VEGFR2, 707-AP, ART-4, B7H4, BAGE, β-catenin/m, Bcr-abl, MN/C IX antibody, CAMEL, CAP-1, CASP-8, CD25, CDC27/m, CDK4/m, CT, Cyp-B, DAM, ErbB3, ELF2M, EMMPRIN, ETV6-AML1, G250, GAGE, GnT-V, Gp100, HAGE, HLA-A*0201-R170I, HPV-E7, HSP70-2M, HST-2, hTERT (or hTRT), iCE, IL-2R, IL-5, KIAA0205, LAGE, LDLR/FUT, MAGE, MART-1/melan-A, MART-2/Ski, MC1R, myosin/m, MUM-1, MUM-2, MUM-3, NA88-A, PAP, proteinase-3, p190 minor bcr-abl, Pml/RARα, PRAME, PSA, PSM, PSMA, RAGE, RU1 or RU2, SAGE, SART-1 or SART-3, survivin, TPI/m, TRP-1, TRP-2, TRP-2/INT2, WT1, NY-Eso-1 or NY-Eso-B or vimentin.

In some embodiments, the tumor antigen-binding immune binding protein (e.g., antibody) can be used to make a chimeric antigen receptor (CAR) specific for the tumor antigen, and this CAR construct is placed into a T cell and/or a natural killer cell. In some embodiments, the T cells and/or natural killer cells with the tumor specific CAR are used to treat subjects with cancers that bear the tumor antigen.

In some embodiments, the immune binding proteins of the invention are useful for treating subjects with allergies. Common allergens include shellfish, nuts, milk, pollen, certain medications, latex, insect bites, and some plant compounds (e.g., urushiol). In some embodiments, the immune binding proteins of the invention bind the allergen of interest without triggering the allergic reaction. For example, the immune binding protein could be an antibody without an Fc region, or could be an antibody in an IgG format or other format that is not an IgE format. In these embodiments, the immune binding protein of the invention binds to the allergen without triggering an allergic reaction, and this binding can prevent IgE antibody in the subject from binding to the allergen and causing the allergic reaction (this is a competitive inhibition reaction). In some embodiments, the immune binding protein which binds the allergen is obtained from the subject with the allergy.

In some embodiments, the immune binding proteins of the invention are useful for treating subjects with autoimmune diseases. In some embodiments, the autoimmune disease is rheumatoid arthritis, lupus, celiac disease, Sjögren's syndrome, polymyalgia rheumatica, multiple sclerosis, ankylosing spondylitis, Type 1 diabetes, and the like. In some embodiments, the immune binding proteins of the invention bind the antigen target of the autoimmune disease without triggering the autoimmune reaction. For example, the immune binding protein could be an antibody without an Fc region, or could be an antibody in a format that does not interact with the effector cells that are associated with the autoimmune disease. In these embodiments, the immune binding protein of the invention binds to the autoimmune antigen without triggering an autoimmune reaction, and this binding can prevent the subject's immune system from reacting with the autoimmune antigen, reducing the autoimmune disease (this can be a competitive inhibition reaction).

All publications and patents cited in this specification are herein incorporated by reference as if each individual publication or patent were specifically and individually indicated to be incorporated by reference and are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited.

EXAMPLES

Example 1. Multiplexed Antigen Staining of Primary Cells

In some embodiments, barcoded peptide antigens are prepared by incubating antigens with an NHS-DBCO heterobifunctional crosslinker. A DNA oligo with a 5′ primer site, a DNA barcode, a 3′ primer site, and a 3′ poly dT, and containing a 3′ biotin and a 5′ azide, is then mixed with the peptide-DBCO antigens to make barcode labeled antigens.

In some embodiments, human B cells with membrane bound receptors are isolated using magnetic separation. Cells are incubated with the mixture of barcode labeled antigens so that labeled antigens bind membrane bound immunoglobulin receptors. The cells are washed, and optionally the cells may be FACS sorted after incubating them with a streptavidin-PE fluorophore. In some embodiments, the cells are then single cell sorted into plates containing a Triton based lysis mixture and poly-dT primer with a 5′ amplification tag. In some embodiments, a reverse transcription reaction is performed with a template switching reverse transcriptase, a template switch primer and appropriate buffer and dNTP mixture. The cDNA library with barcoded antigen is amplified with KAPA HiFi and primers specific to the amplification tag and the template switch sequence. In some embodiments, specific regions of interest, such as the heavy and light chain CDR regions and the antigen barcode, are amplified via PCR with primers containing a well-specific barcode and a 3′ primer to the region of interest. In some embodiments, these fragments are used to generate a sequencing library for high throughput sequencing. After sequencing, the data is de-convoluted by identification of well-specific barcodes, sequence assembly of heavy and light chain reads, and identification of reads with antigen barcodes.
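The first de-convolution step, assigning reads to wells by their well-specific barcode, can be illustrated with a minimal sketch. The barcode length of 8 bases, the placement of the barcode at the 5′ end of each read, exact matching, and the function name are assumptions made for illustration only; none of them is specified in this example.

```python
from collections import defaultdict

# Assumed barcode length; the specification does not state one.
WELL_BARCODE_LEN = 8


def demultiplex_by_well(reads, known_barcodes, barcode_len=WELL_BARCODE_LEN):
    """Group reads by the well-specific barcode assumed to sit at the 5' end.

    The barcode is trimmed from each read. Reads whose prefix is not an
    exact match to a known barcode are collected under the key None.
    """
    wells = defaultdict(list)
    for read in reads:
        prefix = read[:barcode_len]
        key = prefix if prefix in known_barcodes else None
        wells[key].append(read[barcode_len:])
    return dict(wells)
```

Each per-well group could then be passed to heavy chain and light chain assembly and to antigen barcode identification. A practical pipeline would likely also tolerate sequencing errors in the barcode, which this sketch does not attempt.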

Example 2. Multiplexed Antigen Library Sequencing Using Beads

A pool of B-cells bound to antigens is made as described in Example 1. In some embodiments, following antigen staining and washing, cells are separated with a monodisperse droplet generator. In some embodiments, the droplets comprise lysis/binding mix containing one or more barcoded poly-dT capture beads (beads coated with a DNA primer containing a 5′ amplification tag and a 3′ poly dT sequence) in a high salt/detergent buffer and 1-10 cells. In some embodiments, an oil phase is used to make an emulsion, the oil phase being a perfluorocarbon containing an amphiphilic fluorous/aqueous surfactant. As the cells lyse, their RNA is captured on the barcoded poly-dT beads, as is the barcoded antigen DNA. In some embodiments, the emulsion is broken under stringent binding conditions, such as with methylene chloride and 6×SSC buffer. The bead mixture is washed twice, resuspended in a reverse transcriptase reaction, and incubated. In some embodiments, the beads (“template beads”) are separated in another water/oil emulsion generated with a monodisperse droplet generator so that each droplet has about one “template bead” in a PCR mixture. The PCR mixture also contains one or more “prep” beads, which are coated with primers containing a 5′ amplification tag and a bead specific barcode. In some embodiments, some of the primers have a 3′ poly dA, some have a 3′ antigen primer, some have a 3′ heavy chain reverse primer, and some have a 3′ light chain reverse primer. In some embodiments, the aqueous phase has 5′ heavy chain primers, 5′ light chain primers, 5′ antigen primers and the 5′ amplification tag from the poly dT capture beads. KAPA HiFi is a suitable polymerase for this amplification. In some embodiments, following PCR the emulsion is broken and a high throughput sequencing library is generated. Following sequencing, all reads associated with the last round PCR barcodes are split into pools. Then, cell-specific barcodes are identified by the reads associated with the polyA/5′ amplification tag. In some embodiments, all reads associated with beads containing the same cell-specific barcodes are grouped together. In some embodiments, these groups are used to provide the sequence or identification of the heavy chain, light chain and the antigen which associate together.
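The grouping logic of this example (split reads by the last round PCR barcode, identify the cell-specific barcode in each pool, then merge pools that share a cell-specific barcode) can be sketched as follows. The record layout `(bead_barcode, kind, payload)`, the kind labels, and the rule of skipping pools with zero or several cell-specific barcodes are hypothetical choices for illustration and are not taken from the specification.

```python
from collections import defaultdict


def group_by_cell(records):
    """Link heavy chain, light chain and antigen reads through barcodes.

    records: iterable of (bead_barcode, kind, payload) tuples, where kind
    is 'cell_tag', 'heavy', 'light' or 'antigen'. For 'cell_tag' records
    the payload is the cell-specific barcode read via the polyA primer.
    """
    # Step 1: split reads into pools by the last round (bead) barcode.
    by_bead = defaultdict(list)
    for bead, kind, payload in records:
        by_bead[bead].append((kind, payload))

    # Steps 2 and 3: find each pool's cell barcode, then merge pools
    # that carry the same cell barcode.
    cells = defaultdict(lambda: {"heavy": set(), "light": set(), "antigen": set()})
    for items in by_bead.values():
        cell_tags = {payload for kind, payload in items if kind == "cell_tag"}
        if len(cell_tags) != 1:
            continue  # missing or ambiguous cell barcode: skipped in this sketch
        cell = next(iter(cell_tags))
        for kind, payload in items:
            if kind != "cell_tag":
                cells[cell][kind].add(payload)
    return dict(cells)
```

Each entry of the returned mapping then holds the heavy chain, light chain and antigen identifiers that were associated with one cell-specific barcode.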

Example 3. Multiplexed Antigen Library Sequencing Using 5′5′ Primers

A 5′5′ primer is made by mixing a 5′ DBCO oligonucleotide and a 5′ azide oligonucleotide. In some embodiments, the DBCO and azide do not need to be at the precise 5′ end of the component oligos but may be placed in a manner that still allows for the 3′ end to perform a PCR reaction. The combined product is isolated from unreacted component oligos. In some embodiments, it may be higher yielding to use these 5′5′ primers instead of beads for linking reads to cell-specific barcodes. In some embodiments, a reaction uses primers containing a 5′5′ linkage with one of the 3′ ends containing a polyA and the other containing a 3′ light, 3′ heavy or 3′ antigen tag. In some embodiments, the reaction mix also contains 5′ heavy, 5′ light, 5′ antigen and 5′ amplification tag primers with 5′ phosphate groups. In some embodiments, an emulsion is generated with a monodisperse emulsion generator so that each droplet contains about 1 “template bead” and the 5′5′ primer mixture and KAPA HiFi in a suitable buffer with dNTPs, etc. Following emulsion PCR, the emulsion is broken with methylene chloride, and the aqueous phase is extracted and cleaned. The DNA obtained is resuspended in ligation buffer with ligase. In some embodiments, the DNA obtained after ligation is treated with exonuclease(s). In some embodiments, the mixture obtained after exonuclease treatment is placed into a PCR with KAPA HiFi for 20 cycles with the 3′ polyA primer, 3′ heavy primer and 3′ light primer. In some embodiments, the PCR product is used as a template to generate a sequencing library which is sequenced on a high throughput sequencer. Following sequencing, the reads are grouped according to their cell-specific barcode and then reads for heavy, light and antigen are identified.

Example 4. Multiplexed Gene Specific Bead Libraries with PCR

In some embodiments, bead libraries are made where each bead has primers containing a bead specific barcode, a molecule specific barcode and a plurality of gene specific primers. In some embodiments, MyOne carboxylate Dynabeads are first coated with a 5′ amplification primer sequence. The beads are incubated with a limited dilution of DNA primers containing the reverse complement amplification sequence at the 3′ end, a unique molecular barcode comprising 12 N residues, and an adapter sequence of 12 bases (for example the M13 sequencing primer sequence). After incubating the beads with this mixture, the beads are pelleted and washed, and then placed in a Klenow exo-polymerase reaction. The beads are then pelleted and washed.

In some embodiments, the beads are resuspended into a PCR mixture containing polymerase and a soluble version of the reverse complement adapter sequence and placed into an oil/aqueous emulsion using a monodisperse droplet generator. Following droplet generation, the emulsion is cycled 30 times and then broken. The beads are then reacted with T7 exonuclease. Then the beads are placed in a reaction mixture containing a Klenow exo-polymerase mix and oligonucleotides with the 3′ reverse complement adapter sequence, a 10 base random DNA sequence, and a 5′ adapter 2 sequence. In some embodiments, following this reaction, the beads are treated with T7 exonuclease. In an alternate embodiment, heat is used to remove the second strand instead of a nuclease. In some embodiments, the beads are then placed in a mixture containing a Klenow exo-polymerase mix and capture primers with a 3′ reverse complement of adapter 2 and a 5′ reverse complement of the 3′ heavy chain, 3′ light chain or 3′ antigen tag. In some embodiments, following this reaction the beads are treated with T7 exonuclease.
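As a rough guide to how much diversity a random barcode of this kind can provide, the short sketch below counts the possible sequences for a stretch of N residues and estimates how often a barcode is expected to be unique within a pool. It assumes that each position is drawn uniformly and independently from the four bases, which real oligonucleotide synthesis only approximates; the function names are hypothetical.

```python
def barcode_space(n_random_bases):
    """Number of distinct sequences for a run of fully random (N) bases."""
    return 4 ** n_random_bases


def expected_unique_fraction(n_molecules, n_random_bases):
    """Expected fraction of molecules whose barcode is shared with no other.

    Assumes barcodes are drawn uniformly and independently, so a given
    molecule is unique with probability (1 - 1/S) ** (n - 1), where S is
    the size of the barcode space and n the number of molecules.
    """
    space = barcode_space(n_random_bases)
    return (1.0 - 1.0 / space) ** (n_molecules - 1)
```

Under these assumptions, `barcode_space(12)` gives 16,777,216 possible 12-base barcodes, and the unique fraction falls as the number of barcoded molecules approaches that figure.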

Example 5. Multiplexed Gene Specific Bead Libraries with Ligation

In some embodiments, bead libraries are made where each bead has primers containing a bead specific barcode, a molecule specific barcode and a plurality of gene specific primers. In some embodiments, MyOne carboxylate Dynabeads are first coated with a 5′ amplification primer sequence with a 5′ amino moiety. The beads are then incubated with a limited dilution of DNA primers containing the reverse complement amplification sequence at the 3′ end, a unique molecular barcode comprising 12 N residues, and an adapter sequence of 12 bases (for example the M13 sequencing primer sequence). After incubating the beads with this mixture, the beads are treated with Klenow exo-polymerase. In some embodiments, the beads are then mixed with a soluble version of the reverse complement adapter sequence and placed into an oil/aqueous emulsion using a monodisperse droplet generator. Following droplet generation, the emulsion is cycled 30 times and then broken. The beads are placed in a mixture with a double stranded DNA sequence in which the forward strand contains a 5′ phosphate, a 10 base random DNA sequence, and the 3′ heavy primer, 3′ light chain primer or 3′ antigen tag primer at the 3′ end. The mixture also contains T4 DNA ligase. After this reaction, the beads are treated with T7 exonuclease.

Example 6. Preparation of B Cells with Membrane Bound Receptors

In some embodiments, it may be beneficial to increase the receptor density on cells. In some embodiments, primary B cells are transformed into antibody secreting plasma cells by incubation with IL21, IL4, and CD40L. These cells are treated with an NHS-azide heterobifunctional crosslinker. Protein-G DBCO is prepared by mixing protein G with an NHS-DBCO heterobifunctional crosslinker. The cells are treated with the protein-G DBCO with additional protein-G and then spatially separated in a microwell array with soluble or solid phase protein-G in the buffer. The cells are centrifuged or gravity settled into the bottom of each well. The cells are removed from the microwell by centrifugation or gravity and placed in a solution with a metabolic inhibitor such as is present in many commonly available stain buffers. Following this treatment, the cells are reacted with antigens.

Example 7. Preparation of B Cells with Hydrogel Bound Receptors

In some embodiments, it may be beneficial to further increase the receptor density on the antigen binding cells. In some embodiments, primary B cells are transformed into antibody secreting plasma cells by incubation with IL21, IL4, and CD40L. The cells are treated with an NHS-azide heterobifunctional crosslinker and then spatially separated in a microwell array. The cells in the microwells are treated with a DBCO 4× dendrimer PEG, and then treated with an azide-azide homobifunctional 1 kDa PEG. In some embodiments, the DBCO 4× dendrimer PEG treatment and the homobifunctional azide-azide 1 kDa PEG treatment are repeated for a desired number of rounds. These additional cycles of DBCO/azide PEGs create additional functionalization sites and a larger hydrogel volume for better signal, and are continued until a desired amount of functionalization and/or hydrogel is produced. In some embodiments, Protein-G DBCO is prepared by mixing protein G with an NHS-DBCO heterobifunctional crosslinker. The cells embedded in hydrogel are treated with the protein-G DBCO with additional protein-G. The cells are removed from the microwell and placed in a solution with a metabolic inhibitor such as is present in many commonly available stain buffers. The cells are ready for reaction with antigens. Alternatively, the cell/hydrogel mixture is left in the well array and stained in situ with antigens.

Example 8. Preparation of B Cells with Magnetic Bead Bound Receptors

In some embodiments, primary B cells are transformed into antibody secreting plasma cells by incubation with IL21, IL4, and CD40L. The cells are treated with an NHS-azide heterobifunctional crosslinker and washed.

In some embodiments, Protein-G beads are prepared by activating magnetic carboxylated beads with EDC/sulfo-NHS and reacting with protein G. Protein-G DBCO beads are prepared by mixing protein G beads with an NHS-DBCO heterobifunctional crosslinker. The cells are spatially separated in a microwell array. Protein G DBCO beads are added to each well. In some embodiments, soluble azide PEG and soluble protein G are also added to the wells. In some embodiments, beads with antibodies are removed from the microwell by magnetic separation, centrifugation or gravity and placed in a solution with a metabolic inhibitor such as is present in many commonly available stain buffers. The antibody beads are then reacted with antigen. Alternatively, the cell/bead mixture is left in the well array and stained in situ with antigens.

Example 9. Multiplexed ScFv Generation Using 5′5′ Primers

In some embodiments, cDNA made from individual cells as described above is isolated in a microwell array or emulsion in a mixture containing a library of linked 5′5′ primers, where one side is specific to the 5′ coding frame of the heavy chain variable sequence and one side is specific to the 3′ coding frame of a light chain variable sequence. Additionally, the PCR mix contains KAPA HiFi polymerase and a primer library for light chain 5′ variable regions and heavy chain 3′ variable regions. The DNA obtained from the reaction is ligated with T4 ligase and then treated with exonuclease. This mixture is placed into a PCR with KAPA HiFi for 20 cycles with the 3′ heavy primer library and 5′ light primer library. Following PCR, this material is cloned into a suitable expression vector for production of proteins containing an ScFv fragment. Alternatively or in addition, the combined ScFv DNA library is used to make a sequencing library for high throughput sequencing.

Example 10. Micropatterned Surface for Capture of Single Beads or Cells

In some embodiments, a glass surface is cleaned before spin coating with AZ4620 positive resist. In some embodiments, the surface is patterned with a chrome mask in a manner that exposes 1 um spots spaced 50 um center to center. In some embodiments, the resist is developed and the surface is silanized with APTES, and then treated with NHS-PEG-azide. In some embodiments, the resist is stripped with acetone and DMSO, and the surface is then treated with a 10 kDa PEG-silane mixed with fluorooctyl triethoxy silane to passivate the surface. Varying the ratio of PEG-silane to fluorooctyl silane will produce passivated surfaces with different qualities. In some embodiments, the ratio is 1:1000 PEG-silane:fluorooctyl silane in order to make the surface slightly hydrophobic at the non-cell capture positions without contributing to non-specific interactions of cellular proteins with the passivated surface. In some embodiments, cells or beads are functionalized with NHS-DBCO, washed and introduced to the array surface in a binding buffer containing 2% BSA, 1 mM EDTA and 25 mM HEPES via pipetting or syringe pump and allowed to incubate under gentle agitation or fluid flow. The array is then washed.
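The array geometry stated in this example (1 um spots on a 50 um center to center grid) fixes the capture site density, which the following sketch computes. It assumes a square grid and circular spots with the stated size taken as the diameter; neither assumption is stated in the example, and the function names are hypothetical.

```python
import math


def spots_per_mm2(pitch_um):
    """Capture spots per square millimetre for a square grid (assumed)."""
    spots_per_mm = 1000.0 / pitch_um
    return spots_per_mm ** 2


def capture_area_fraction(spot_diameter_um, pitch_um):
    """Fraction of the surface taken up by circular capture spots (assumed)."""
    spot_area = math.pi * (spot_diameter_um / 2.0) ** 2
    return spot_area / (pitch_um ** 2)
```

With a 50 um pitch, `spots_per_mm2(50)` gives 400 capture sites per square millimetre, and the 1 um spots cover only about 0.03% of the surface, so nearly all of the area is the passivated region.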

Example 11. Micropatterned Surface Capture of Specific Cell Types

In some embodiments, the micropatterned array from Example 10 is used to capture cells displaying a specific surface marker, for example human B-cells. In some embodiments, an anti-CD19 antibody containing a DBCO moiety (an antibody functionalized with DBCO-NHS) is added to a mixture of peripheral blood mononuclear cells in a suitable stain buffer (e.g., PBS-BSA); the cells are washed, introduced to the array surface in a suitable binding buffer, and allowed to incubate under gentle agitation or fluid flow. The array is then washed, and the captured B-cells are retained on the array.

Example 12. Transfer of Cells from Micropatterned Surface to Microwells for Subsequent Reactions

Examples 10-11 may be modified to spatially isolate the buffer surrounding cells or beads for additional reactions. In some embodiments, a microwell array made of PDMS with wells of size 30 um × 30 um × 100 um deep, 50 um center to center, is pre-filled with a mixture containing poly-dT beads and a weak lysis/binding mixture (for example, a mixture containing LiCl and a detergent such as 1% Triton or 0.5% Tween-20). This microwell array is placed against the micropatterned cell array from Examples 10-11 in order to register 1 cell to 1 well in the array. Lysis occurs in the microwell and mRNA is captured on the poly-dT beads.
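
The registration step above can be reasoned about with a short sketch. Because the capture spots and the wells share the same 50 um pitch, a uniform lateral misalignment shifts every cell by the same amount relative to its well. The Python below checks whether a cell centered on a capture spot still lies entirely within a 30 um × 30 um well opening for a given offset. The 10 um cell diameter, the purely translational misalignment (no rotation), and the function name are assumptions made for illustration, not values from this example.

```python
def cell_fits_in_well(
    dx_um: float,
    dy_um: float,
    well_side_um: float = 30.0,      # well opening, from the example
    pitch_um: float = 50.0,          # center-to-center spacing, from the example
    cell_diameter_um: float = 10.0,  # hypothetical cell size
) -> bool:
    """Return True if a cell on a capture spot lies fully inside a well opening.

    dx_um and dy_um are the lateral offsets between the spot grid and the
    well grid. Offsets are wrapped to the nearest well center because both
    grids repeat with the same pitch.
    """

    def wrap(offset: float) -> float:
        # Map any offset into the interval [-pitch/2, +pitch/2).
        return (offset + pitch_um / 2.0) % pitch_um - pitch_um / 2.0

    # Largest center offset that keeps the whole cell inside the opening.
    margin = well_side_um / 2.0 - cell_diameter_um / 2.0
    return abs(wrap(dx_um)) <= margin and abs(wrap(dy_um)) <= margin


print(cell_fits_in_well(0.0, 0.0))    # perfectly aligned
print(cell_fits_in_well(12.0, 0.0))   # cell edge would overlap the well wall
print(cell_fits_in_well(50.0, 0.0))   # shifted by one full pitch
```

With these assumed dimensions, the tolerated misalignment is 10 um in either direction along each axis, and a shift of exactly one pitch simply registers each cell to the neighboring well.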

Example 13. Transfer of Beads from Micropatterned Surface to Microwells for Subsequent Reactions

In this embodiment, beads containing cDNA libraries and barcoded antigen tags from single cells are transferred to a microwell array as in Example 12. In some embodiments, a microwell array made of PDMS with wells of size 30 um × 30 um × 100 um deep, 50 um center to center, is prefilled with a PCR mixture that contains primers for amplifying particular molecules of interest, such as in Examples 2 and 3. The PCR occurs in the microwell array.

Example 14. Spatial Confinement of Single Cells for the Capture of Antibodies on the Cellular Surface

In some embodiments, single cells (cells displaying antibodies) are bound for subsequent capture of antibodies on their surface. In this embodiment, cells are captured on a micropatterned surface similar to Examples 10-12. In some embodiments, the spots of the micropatterned surface are derivatized with NHS-PEG-desthiobiotin and, following treatment with a PEG-silane, the surface is treated with a solution containing streptavidin in the cell binding buffer. The cells are functionalized with NHS-PEG-desthiobiotin and NHS-PEG-Azide before introduction to the array. A linkage between the cells and the surface is therefore made via a desthiobiotin-streptavidin-desthiobiotin interaction. In some embodiments, a microwell array containing a mixture of protein G-DBCO is registered and placed against the cell array to spatially compartmentalize each cell for capture of antibodies secreted from the cell to its surface. After incubation, the microwell array is removed and the cells are washed with a solution containing free protein G and a metabolic inhibitor. The cells may be stained with antigen in situ and then released by the introduction of biotin, or released by the introduction of biotin and subsequently stained with antigen as mentioned in previous examples.

Example 15. Transfer of Cells or Beads from Micropatterned Surface to Microwell Array

In some embodiments, Example 14 is modified to transfer the cells in a registered manner to the microwell array for subsequent reactions. The microwell array is loaded with a solution containing biotin. Upon registration/placement of the microwell array against the micropatterned surface, the cells are released from the surface due to the displacement of desthiobiotin by biotin. The cells may be moved to the bottom of the wells by centrifugation or other means. This step may be repeated to load additional cells or beads in a similar manner so that a precise number of cells or beads is added to each position in the microwell array.

Example 16. Precise Registration of Single Bead and Single Cell in One Step

In some embodiments, a micropatterned surface with approximately 400 nm features is used to capture beads of size 1 um. This is similar to Example 10, using a PEG-Azide on the surface to capture a PEG-DBCO on the beads. In some embodiments, cells that have been functionalized with a PEG-Azide are then added. Since the beads are much smaller than the cell diameter, approximately one cell is captured by each bead, which in turn is captured by the micropatterned surface. In this manner, a precise one bead to one cell ratio on a registered grid is possible, which can be used for subsequent workup. In this example, the bead is also functionalized with a barcoded poly-dT capture probe for mRNA capture upon lysis of the single cell in a confined space, such as in Example 13. The bead/cell combination can also be released for placement in an emulsion as in Example 2.

All publications and patents cited in this specification are herein incorporated by reference as if each individual publication or patent were specifically and individually indicated to be incorporated by reference and are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited.

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

1-20. (canceled)
21. A method for making a nucleic acid, comprising the steps of: providing a plurality of host cells wherein the host cells comprise a plurality of nucleic acids encoding a plurality of antibody heavy chains and encoding a plurality of antibody light chains, wherein the plurality of host cells express a plurality of different antibodies, and wherein each antibody can bind an antigen; binding some of the plurality of host cells to a plurality of different antigens wherein each antigen has a unique bar code that identifies the antigen; isolating individual host cells bound to one of the antigens; adding a first primer set, a second primer set, a third primer set, a plurality of dNTPs, and a polymerase to the host cells that have bound the antigens, wherein the first primer set comprises a forward primer and a reverse primer, wherein one of the primers of the first set has an overlap extension region for a nucleic acid encoding the antibody light chain, the other primer has an overlap extension region for the nucleic acid encoding the bar code, wherein the first primer set amplifies a nucleic acid encoding the antibody heavy chain, wherein the second primer set comprises a forward primer and a reverse primer, wherein one of the primers of the second set has an overlap extension region for the nucleic acid encoding the antibody heavy chain, wherein the second primer set amplifies the nucleic acid encoding the antibody light chain, wherein the third primer set comprises a forward primer and a reverse primer, wherein one of the primers of the second set has an overlap extension region for the nucleic acid encoding the antibody heavy chain, wherein the second primer set amplifies the nucleic acid encoding the bar code; reacting the first primer set, the second primer set, the third primer set, the dNTPs, the polymerase, the nucleic acid encoding the antibody heavy chain, the nucleic acid encoding the antibody light chain, and the bar code whereby a single nucleic acid is obtained that encodes the antibody light chain, the antibody heavy chain, and the bar code.
22. The method of claim 21, wherein the host cell is in an emulsion.
23. The method of claim 21, wherein the host cell is in a well.
24. The method of claim 21, wherein the cell is in a tube.
25. The method of claim 21, wherein the plurality of cells are in an array of a plurality of drops held by surface tension to a surface.
26. The method of claim 21, wherein the plurality of cells are spatially isolated by means of a hydrogel.
27. The method of claim 21, wherein each of the plurality of different antigens is attached to a plurality of beads whereby each different antigen is attached to a different bead.
28. The method of claim 27, wherein each unique barcode for each different antigen is attached to the beads for that antigen.
29. The method of claim 21, wherein the antibody is selected from the group consisting of a scFv, a Fab, a F(ab′)2, a Fab′, a Fv, and a diabody.
30. The method of claim 21, wherein the antibody is selected from the group consisting of an IgG, an IgM, an IgA, an IgD, and an IgE.
31. The method of claim 21, wherein the host cell is selected from the group consisting of a B-cell, a plasma cell, a B memory cell, a pre-B-cell and a progenitor B-cell.
32. The method of claim 21, wherein the plurality of different antigens are from a virus.
33. The method of claim 32, wherein the virus causes an infectious disease.
34. The method of claim 33, wherein the virus is a coronavirus.
35. The method of claim 21, wherein the host cells are obtained from a subject that mounted a protective immune response against a virus.
36. The method of claim 35, wherein the virus is a coronavirus.
37. The method of claim 21, further comprising the step of placing the nucleic acid encoding the antibody light chain, the antibody heavy chain, and the bar code into a vector.
38. The method of claim 21, further comprising the step of sequencing the nucleic acid encoding the antibody light chain, the antibody heavy chain, and the bar code into a vector.
39. The method of claim 37, further comprising the step of sequencing the nucleic acid encoding the antibody light chain, the antibody heavy chain, and the bar code into a vector.
40. The method of claim 37, further comprising the step of placing the vector into a bacterial cell.